
    Bottom-Up and Top-Down Reasoning with Hierarchical Rectified Gaussians

    Convolutional neural nets (CNNs) have demonstrated remarkable performance in recent history. Such approaches tend to work in a unidirectional bottom-up feed-forward fashion. However, practical experience and biological evidence tell us that feedback plays a crucial role, particularly for detailed spatial understanding tasks. This work explores bidirectional architectures that also reason with top-down feedback: neural units are influenced by both lower and higher-level units. We do so by treating units as rectified latent variables in a quadratic energy function, which can be seen as a hierarchical Rectified Gaussian model (RGs). We show that RGs can be optimized with a quadratic program (QP), which can in turn be optimized with a recurrent neural network (with rectified linear units). This allows RGs to be trained with GPU-optimized gradient descent. From a theoretical perspective, RGs help establish a connection between CNNs and hierarchical probabilistic models. From a practical perspective, RGs are well suited for detailed spatial tasks that can benefit from top-down reasoning. We illustrate them on the challenging task of keypoint localization under occlusions, where local bottom-up evidence may be misleading. We demonstrate state-of-the-art results on challenging benchmarks. Comment: To appear in CVPR 201

    Understanding the Limitations of CNN-based Absolute Camera Pose Regression

    Visual localization is the task of accurate camera pose estimation in a known scene. It is a key problem in computer vision and robotics, with applications including self-driving cars, Structure-from-Motion, SLAM, and Mixed Reality. Traditionally, the localization problem has been tackled using 3D geometry. Recently, end-to-end approaches based on convolutional neural networks have become popular. These methods learn to directly regress the camera pose from an input image. However, they do not achieve the same level of pose accuracy as 3D structure-based methods. To understand this behavior, we develop a theoretical model for camera pose regression. We use our model to predict failure cases for pose regression techniques and verify our predictions through experiments. We furthermore use our model to show that pose regression is more closely related to pose approximation via image retrieval than to accurate pose estimation via 3D structure. A key result is that current approaches do not consistently outperform a handcrafted image retrieval baseline. This clearly shows that additional research is needed before pose regression algorithms are ready to compete with structure-based methods. Comment: Initial version of a paper accepted to CVPR 201

    Automatic Craniomaxillofacial Landmark Digitization via Segmentation-Guided Partially-Joint Regression Forest Model and Multiscale Statistical Features

    The goal of this paper is to automatically digitize craniomaxillofacial (CMF) landmarks efficiently and accurately from cone-beam computed tomography (CBCT) images, by addressing the challenge caused by large morphological variations across patients and image artifacts of CBCT images

    A Collaborative Visual Localization Scheme for a Low-Cost Heterogeneous Robotic Team with Non-Overlapping Perspectives

    This paper presents and evaluates a relative localization scheme for a heterogeneous team of low-cost mobile robots. An error-state, complementary Kalman filter was developed to fuse the analytically derived uncertainty of stereoscopic pose measurements of an aerial robot, made by a ground robot, with the inertial/visual proprioceptive measurements of both robots. Results show that the sources of error (image quantization, asynchronous sensors, and a non-stationary bias) were sufficiently modeled to estimate the pose of the aerial robot. In both simulation and experiments, we demonstrate the proposed methodology with a heterogeneous robot team consisting of a UAV and a UGV tasked with collaboratively localizing themselves while avoiding obstacles in an unknown environment. The team is able to identify a goal location and obstacles in the environment and plan a path for the UGV to the goal location. The results demonstrate localization accuracies of 2 cm to 4 cm, on average, while the robots operate at a distance of between 1 m and 4 m from each other.

    Co-Localization of Audio Sources in Images Using Binaural Features and Locally-Linear Regression

    This paper addresses the problem of localizing audio sources using binaural measurements. We propose a supervised formulation that simultaneously localizes multiple sources at different locations. The approach is intrinsically efficient because, contrary to prior work, it relies neither on source separation nor on monaural segregation. The method starts with a training stage that establishes a locally-linear Gaussian regression model between the directional coordinates of all the sources and the auditory features extracted from binaural measurements. While fixed-length wide-spectrum sounds (white noise) are used for training to reliably estimate the model parameters, we show that the testing (localization) can be extended to variable-length sparse-spectrum sounds (such as speech), thus enabling a wide range of realistic applications. Indeed, we demonstrate that the method can be used for audio-visual fusion, namely to map speech signals onto images and hence to spatially align the audio and visual modalities, which makes it possible to discriminate between speaking and non-speaking faces. We release a novel corpus of real-room recordings that allows quantitative evaluation of the co-localization method in the presence of one or two sound sources. Experiments demonstrate increased accuracy and speed relative to several state-of-the-art methods. Comment: 15 pages, 8 figure

    A selective overview of nonparametric methods in financial econometrics

    This paper gives a brief overview of the nonparametric techniques that are useful for financial econometric problems. The problems include estimation and inference of instantaneous returns and volatility functions of time-homogeneous and time-dependent diffusion processes, and estimation of transition densities and state price densities. We first briefly describe the problems and then outline the main techniques and main results. Some useful probabilistic aspects of diffusion processes are also briefly summarized to facilitate our presentation and applications. Comment: 32 pages include 7 figure