3D Visual Perception for Self-Driving Cars using a Multi-Camera System: Calibration, Mapping, Localization, and Obstacle Detection
Cameras are a crucial exteroceptive sensor for self-driving cars as they are
low-cost and small, provide appearance information about the environment, and
work in various weather conditions. They can be used for multiple purposes such
as visual navigation and obstacle detection. We can use a surround multi-camera
system to cover the full 360-degree field-of-view around the car. In this way,
we avoid blind spots which can otherwise lead to accidents. To minimize the
number of cameras needed for surround perception, we utilize fisheye cameras.
Consequently, standard vision pipelines for 3D mapping, visual localization,
obstacle detection, etc. need to be adapted to take full advantage of the
availability of multiple cameras rather than treat each camera individually. In
addition, processing of fisheye images has to be supported. In this paper, we
describe the camera calibration and subsequent processing pipeline for
multi-fisheye-camera systems developed as part of the V-Charge project. This
project seeks to enable automated valet parking for self-driving cars. Our
pipeline is able to precisely calibrate multi-camera systems, build sparse 3D
maps for visual navigation, visually localize the car with respect to these
maps, generate accurate dense maps, as well as detect obstacles based on
real-time depth map extraction.
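The abstract describes adapting standard vision pipelines to fisheye cameras. As a minimal sketch, here is the equidistant fisheye projection model (r = f·θ), one common approximation for such lenses; the actual V-Charge calibration may use a different, more general model, and the focal length and principal point below are purely illustrative:

```python
import numpy as np

def project_equidistant(X, f, cx, cy):
    """Project a 3D point into a fisheye image under the equidistant
    model r = f * theta, where theta is the angle from the optical axis.
    X: (x, y, z) point in the camera frame; f: focal length in pixels;
    (cx, cy): principal point."""
    x, y, z = X
    theta = np.arctan2(np.hypot(x, y), z)   # angle from optical axis
    phi = np.arctan2(y, x)                  # azimuth around the axis
    r = f * theta                           # equidistant radial mapping
    return cx + r * np.cos(phi), cy + r * np.sin(phi)
```

A point on the optical axis projects to the principal point, and the image radius grows linearly with the incidence angle, which is what lets a fisheye lens cover very wide fields of view.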
Computer-assisted polyp matching between optical colonoscopy and CT colonography: a phantom study
Potentially precancerous polyps detected with CT colonography (CTC) need to
be removed subsequently, using an optical colonoscope (OC). Due to large
colonic deformations induced by the colonoscope, even very experienced
colonoscopists find it difficult to pinpoint the exact location of the
colonoscope tip in relation to polyps reported on CTC. This can cause unduly
prolonged OC examinations that are stressful for the patient, colonoscopist and
supporting staff.
We developed a method, based on monocular 3D reconstruction from OC images,
that automatically matches polyps observed in OC with polyps reported on prior
CTC. A matching cost is computed, using rigid point-based registration between
surface point clouds extracted from both modalities. A 3D printed and painted
phantom of a 25 cm long transverse colon segment was used to validate the
method on two medium-sized polyps. Results indicate that the matching cost is
smaller at the correct corresponding polyp between OC and CTC: the cost at the
incorrect polyp is 3.9 times higher than at the correct match. Furthermore, we evaluate the matching of the
reconstructed polyp from OC with other colonic endoluminal surface structures
such as haustral folds and show that there is a minimum at the correct polyp
from CTC.
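The matching cost above is computed by rigid point-based registration between surface point clouds from the two modalities. A minimal sketch of such a cost, using a least-squares (Kabsch/SVD) rigid alignment over pre-matched correspondences; the paper's actual registration procedure and residual definition may differ:

```python
import numpy as np

def rigid_registration_cost(P, Q):
    """Least-squares rigid alignment (Kabsch) of point cloud P onto Q,
    returning the RMS residual after alignment as a matching cost.
    P, Q: (N, 3) arrays of corresponding 3D points."""
    Pc, Qc = P - P.mean(0), Q - Q.mean(0)      # remove translation
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)        # SVD of cross-covariance
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T    # optimal rotation
    residual = Pc @ R.T - Qc                   # per-point misalignment
    return float(np.sqrt((residual ** 2).sum(1).mean()))
```

When the query polyp surface is compared against the correct CTC polyp, a cost of this form should be near its minimum; against the wrong polyp or a haustral fold, the residual stays high.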
Automated matching between polyps observed at OC and prior CTC would
facilitate the biopsy or removal of true-positive pathology or exclusion of
false-positive CTC findings, and would reduce colonoscopy false-negative
(missed) polyps. Ultimately, such a method might reduce healthcare costs,
patient inconvenience and discomfort.
Comment: This paper was presented at the SPIE Medical Imaging 2014 conference.
Real-time Monocular Object SLAM
We present a real-time object-based SLAM system that leverages the largest
object database to date. Our approach comprises two main components: 1) a
monocular SLAM algorithm that exploits object rigidity constraints to improve
the map and find its real scale, and 2) a novel object recognition algorithm
based on bags of binary words, which provides live detections with a database
of 500 3D objects. The two components work together and benefit each other: the
SLAM algorithm accumulates information from the observations of the objects,
anchors object features to special map landmarks, and sets constraints on the
optimization. At the same time, objects partially or fully located within the
map are used as a prior to guide the recognition algorithm, achieving higher
recall. We evaluate our proposal in five real environments, showing improvements
in map accuracy and efficiency with respect to other state-of-the-art
techniques.
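The recognition component matches binary descriptors against the object database. A minimal sketch of Hamming-distance matching over binary words, done by brute force here rather than with the vocabulary-tree index a real bag-of-binary-words system would use for live detection (the `max_dist` threshold is illustrative):

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two binary descriptors packed as uint8 arrays."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

def match_descriptors(query, database, max_dist=40):
    """For each query descriptor, return the index of the nearest database
    descriptor within max_dist differing bits, or -1 if none is close enough.
    Brute force for clarity; a bag-of-binary-words vocabulary narrows the
    search in practice."""
    matches = []
    for q in query:
        dists = [hamming(q, d) for d in database]
        best = int(np.argmin(dists))
        matches.append(best if dists[best] <= max_dist else -1)
    return matches
```

The thresholded nearest-neighbour decision is what makes binary descriptors fast: XOR plus a popcount per comparison, with no floating-point arithmetic.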
Geometric Inference with Microlens Arrays
This dissertation explores an alternative to traditional fiducial markers, in which geometric
information is inferred from the observed positions of 3D points seen in an image. We offer an approach that instead enables geometric inference based on the relative orientation
of markers in an image. We present markers fabricated from microlenses whose appearance
changes depending on the marker's orientation relative to the camera. First, we show how
to manufacture and calibrate chromo-coding lenticular arrays to create a known relationship
between the observed hue and orientation of the array. Second, we use two small chromo-coding lenticular arrays to estimate the pose of an object. Third, we use three large chromo-coding lenticular arrays to calibrate a camera from a single image. Finally, we create another type of fiducial marker from lenslet arrays that encode orientation with discrete black-and-white appearances. Collectively, these approaches offer new opportunities for pose estimation and camera calibration that are relevant for robotics, virtual reality, and augmented reality.
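The calibrated hue-orientation relationship can then be inverted at run time: observe the array's hue and look up the viewing angle. A minimal nearest-neighbour sketch with purely illustrative lookup tables; the dissertation's calibration presumably fits a continuous hue-vs-angle model rather than a discrete table:

```python
import numpy as np

def orientation_from_hue(hue, hue_table, angle_table):
    """Invert a calibrated hue-vs-orientation curve for a chromo-coding
    lenticular array by nearest-neighbour lookup. hue_table and
    angle_table are paired samples from the calibration step."""
    i = int(np.argmin(np.abs(np.asarray(hue_table, dtype=float) - hue)))
    return angle_table[i]
```

Given such a per-array mapping, observing two or three arrays yields enough orientation constraints for the pose estimation and single-image camera calibration described above.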
Training a Convolutional Neural Network for Appearance-Invariant Place Recognition
Place recognition is one of the most challenging problems in computer vision,
and has become a key part in mobile robotics and autonomous driving
applications for performing loop closure in visual SLAM systems. Moreover, the
difficulty of recognizing a revisited location increases with appearance
changes caused, for instance, by weather or illumination variations, which
hinders the long-term application of such algorithms in real environments. In
this paper we present a convolutional neural network (CNN), trained for the
first time with the purpose of recognizing revisited locations under severe
appearance changes, which maps images to a low dimensional space where
Euclidean distances represent place dissimilarity. In order for the network to
learn the desired invariances, we train it with triplets of images selected
from datasets which present a challenging variability in visual appearance. The
triplets are selected in such way that two samples are from the same location
and the third one is taken from a different place. We validate our system
through extensive experimentation, where we demonstrate better performance than
state-of-the-art algorithms on a number of popular datasets.
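Training with triplets as described typically minimizes a hinge-style triplet loss over embedding distances, pulling same-place pairs together and pushing different-place pairs at least a margin apart. A minimal sketch on fixed embedding vectors; the margin value and exact formulation used in the paper are assumptions here:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on embedding vectors: zero once the
    negative (different place) is at least `margin` farther from the
    anchor than the positive (same place)."""
    d_pos = np.linalg.norm(anchor - positive)  # same-place distance
    d_neg = np.linalg.norm(anchor - negative)  # different-place distance
    return max(0.0, d_pos - d_neg + margin)
```

Because the loss vanishes for well-separated triplets, training focuses on the hard cases, which is what drives the Euclidean distances in the learned space to reflect place dissimilarity.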
On the structural nature of cooperation in distributed network localization
We demonstrate analytically that the contribution of cooperation in improving the accuracy of distributed network localization is fundamentally structural in nature, rather than statistical as widely believed. To this end we first introduce a new approach to building Fisher Information Matrices (FIMs), in which the individual contribution of each cooperative pair of nodes is captured explicitly by a corresponding information vector. The approach offers new insight into the structure of FIMs, enabling us to easily account for both anchor and node location uncertainties in assessing lower bounds on localization errors. Using this construction, it is surprisingly found that in the presence of node location uncertainty, and regardless of ranging error variances or network size, the FIM terms corresponding to the information added by node-to-node cooperation nearly vanish. In other words, the analysis reveals that the key contribution of cooperation in network localization is not to add statistical node-to-node information (in the Fisher sense), but rather to provide a structure over which information is better exploited.
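The construction described, with one information vector per cooperative pair assembled into a FIM, can be sketched as a sum of rank-one outer products; the contents and scaling of each information vector here are assumptions for illustration, not the paper's exact derivation:

```python
import numpy as np

def build_fim(info_vectors):
    """Assemble a Fisher Information Matrix as a sum of rank-one terms,
    one outer product u u^T per measurement pair (anchor-node or
    node-node). Each u is that pair's information vector."""
    info_vectors = [np.asarray(u, dtype=float) for u in info_vectors]
    dim = info_vectors[0].size
    fim = np.zeros((dim, dim))
    for u in info_vectors:
        fim += np.outer(u, u)  # rank-one contribution of one pair
    return fim
```

Writing the FIM this way makes the structural argument visible: each pair contributes only a rank-one term, so what cooperation adds is governed by the geometry (directions) of those vectors, not merely by how many of them there are.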