Sparse Bayesian information filters for localization and mapping
Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, February 2008.
This thesis formulates an estimation framework for Simultaneous Localization and Mapping (SLAM) that addresses the problem of scalability in large environments.
We describe an estimation-theoretic algorithm that achieves significant gains in computational
efficiency while maintaining consistent estimates for the vehicle pose and
the map of the environment.
We specifically address the feature-based SLAM problem in which the robot represents
the environment as a collection of landmarks. The thesis takes a Bayesian
approach whereby we maintain a joint posterior over the vehicle pose and feature
states, conditioned upon measurement data. We model the distribution as Gaussian
and parametrize the posterior in the canonical form, in terms of the information
(inverse covariance) matrix. When sparse, this representation is amenable to computationally
efficient Bayesian SLAM filtering. However, while the large majority of the
elements within the normalized information matrix are very small in magnitude, the
matrix is nonetheless fully populated. Recent feature-based SLAM filters achieve the scalability
benefits of a sparse parametrization by explicitly pruning these weak links in an effort
to enforce sparsity. We analyze one such algorithm, the Sparse Extended Information
Filter (SEIF), which has laid much of the groundwork concerning the computational
benefits of the sparse canonical form. The thesis performs a detailed analysis of the
process by which the SEIF approximates the sparsity of the information matrix and
reveals key insights into the consequences of different sparsification strategies. We
demonstrate that the SEIF yields a sparse approximation to the posterior that is inconsistent,
suffering from exaggerated confidence estimates. This overconfidence has
detrimental effects on important aspects of the SLAM process and affects the higher
level goal of producing accurate maps for subsequent localization and path planning.
This thesis proposes an alternative scalable filter that maintains sparsity while
preserving the consistency of the distribution. We leverage insights into the natural
structure of the feature-based canonical parametrization and derive a method that
actively maintains an exactly sparse posterior. Our algorithm exploits the structure
of the parametrization to achieve gains in efficiency, with a computational cost that
scales linearly with the size of the map. Unlike similar techniques that sacrifice
consistency for improved scalability, our algorithm performs inference over a posterior
that is conservative relative to the nominal Gaussian distribution. Consequently, we
preserve the consistency of the pose and map estimates and avoid the effects of an
overconfident posterior.
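The overconfidence argument can be made concrete with a toy two-state Gaussian (a sketch with made-up numbers, not the thesis's filter): zeroing a weak off-diagonal entry of the information matrix shrinks the marginal variances that the parametrization implies.

```python
import numpy as np

# Two correlated scalar states (say, a pose and a landmark coordinate).
Sigma = np.array([[1.0, 0.6],
                  [0.6, 1.0]])                 # moment form (covariance)
Lambda = np.linalg.inv(Sigma)                  # canonical form (information)

# Naive sparsification: prune the off-diagonal "weak link" outright.
Lambda_sparse = np.diag(np.diag(Lambda))

var_exact = np.diag(np.linalg.inv(Lambda))         # equals diag(Sigma)
var_pruned = np.diag(np.linalg.inv(Lambda_sparse))
# var_pruned < var_exact: the pruned posterior claims more certainty
# than the original distribution actually supports.
```

With these numbers the pruned marginal variances drop from 1.0 to 0.64, which is exactly the exaggerated-confidence behavior the thesis analyzes.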
We demonstrate our filter alongside the SEIF and the standard EKF, both in simulation
and on two real-world datasets. While we maintain the computational
advantages of an exactly sparse representation, the results show convincingly that
our method yields conservative estimates for the robot pose and map that are nearly
identical to those of the original Gaussian distribution as produced by the EKF, but
at much less computational expense.
The thesis concludes with an extension of our SLAM filter to a complex underwater
environment. We describe a systems-level framework for localization and mapping
relative to a ship hull with an Autonomous Underwater Vehicle (AUV) equipped
with a forward-looking sonar. The approach utilizes our filter to fuse measurements
of vehicle attitude and motion from onboard sensors with data from sonar images of
the hull. We employ the system to perform three-dimensional, 6-DOF SLAM on a
ship hull.
Visually Augmented Navigation for Autonomous Underwater Vehicles
As autonomous underwater vehicles (AUVs) are becoming routinely used in an exploratory context for ocean science, the goal of visually augmented navigation (VAN) is to improve the near-seafloor navigation precision of such vehicles without imposing the burden of having to deploy additional infrastructure. This is in contrast to traditional acoustic long baseline navigation techniques, which require the deployment, calibration, and eventual recovery of a transponder network. To achieve this goal, VAN is formulated within a vision-based simultaneous localization and mapping (SLAM) framework that exploits the systems-level complementary aspects of a camera and strap-down sensor suite. The result is an environmentally based navigation technique robust to the peculiarities of low-overlap underwater imagery. The method employs a view-based representation where camera-derived relative-pose measurements provide spatial constraints, which enforce trajectory consistency and also serve as a mechanism for loop closure, allowing for error growth to be independent of time for revisited imagery. This article outlines the multisensor VAN framework and demonstrates it to have compelling advantages over a purely vision-only approach by: 1) improving the robustness of low-overlap underwater image registration; 2) setting the free gauge scale; and 3) allowing for a disconnected camera-constraint topology.
A Large Scale Inertial Aided Visual Simultaneous Localization And Mapping (SLAM) System For Small Mobile Platforms
In this dissertation we present a robust simultaneous localization and mapping scheme that can be deployed on a computationally limited, small unmanned aerial system. This is achieved by developing a keyframe-based algorithm that leverages the multiprocessing capacity of modern low-power mobile processors. The novelty of the algorithm lies in its design, which makes it robust against rapid exploration while keeping computational time to a minimum. The time-critical components of the localization and mapping system are computed in parallel across the multiple cores of the processor. The algorithm uses a scale- and rotation-invariant, state-of-the-art binary descriptor for landmark description, making it suitable for compact large-scale map representation and robust tracking. The same descriptor is also used in loop-closure detection, making the algorithm efficient by eliminating the need for separate descriptors in a bag-of-words scheme. Effectiveness of the algorithm is demonstrated by performance evaluation on indoor and large-scale outdoor datasets. We demonstrate the efficiency and robustness of the algorithm by successful six-degree-of-freedom (6-DOF) pose estimation in challenging indoor and outdoor environments. Performance of the algorithm is validated on a quadcopter with onboard computation.
Interlandmark measurements from lodox statscan images with application to femoral neck anteversion assessment
Clinicians often take measurements between anatomical landmarks on X-ray radiographs for diagnosis and treatment planning, for example in orthopaedics and orthodontics. X-ray images, however, project three-dimensional internal structures onto a two-dimensional plane during image formation. Depth information is therefore lost, and measurements do not truly reflect spatial relationships. The main aim of this study was to develop an inter-landmark measurement tool for the Lodox Statscan digital radiography system. X-ray stereophotogrammetry was applied to Statscan images to enable three-dimensional point localization for inter-landmark measurement using two-dimensional radiographs. This technique requires images of the anatomical region of interest to be acquired from different perspectives, as well as a suitable calibration tool to map image coordinates to real-world coordinates. The Statscan is suited to the technique because it is capable of axial rotations for multiview imaging. Three-dimensional coordinate reconstruction and inter-landmark measurements were taken using a planar object and a dry pelvis specimen in order to assess intra-observer measurement accuracy, reliability, and precision. The system yielded average (X, Y, Z) coordinate reconstruction accuracy of (0.08, 0.12, 0.34) mm and resultant coordinate reconstruction accuracy within 0.4 mm (range 0.3 mm – 0.6 mm). Inter-landmark measurements within 2 mm for lengths and 1.8° for angles were obtained, with average accuracies of 0.4 mm (range 0.0 mm – 2.0 mm) and 0.3° (range 0.0° – 1.8°), respectively. The results also showed excellent overall precision of (0.5 mm, 0.1°) and were highly reliable when all landmarks were completely visible in both images. Femoral neck anteversion measurement on Statscan images was also explored using 30 dry right adult femurs. This was done in order to assess the feasibility of the algorithm for a clinical application.
For this investigation, four methods were tested to determine the optimal landmarks for measurement, and the measurement process involved calculation of virtual landmarks. The method that yielded the best results produced all measurements within 1° of reference values, and the measurements were highly reliable with very good precision within 0.1°. The average accuracy was within 0.4° (range 0.1° – 0.8°). In conclusion, X-ray stereophotogrammetry enables accurate, reliable, and precise inter-landmark measurements for the Lodox Statscan X-ray imaging system. The machine may therefore be used as an inter-landmark measurement tool for routine clinical applications.
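The core geometric step of X-ray stereophotogrammetry, locating a 3D point from its projections in two calibrated views, can be sketched with linear (DLT) triangulation. The camera matrices and coordinates below are toy values, not Lodox Statscan calibration data.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views.
    P1, P2: 3x4 camera projection matrices; x1, x2: 2D image coordinates."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)      # null vector of A is the homogeneous point
    X = Vt[-1]
    return X[:3] / X[3]              # dehomogenize

# Two toy cameras 1 m apart along X, both looking down +Z.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.2, 0.1, 3.0])
x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]
X_est = triangulate(P1, P2, x1, x2)   # recovers [0.2, 0.1, 3.0]
```

With noise-free projections the null space of A is exact, so the point is recovered to machine precision; with real radiographs the SVD returns the least-squares solution instead.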
High-quality dense stereo vision for whole body imaging and obesity assessment
The prevalence of obesity has necessitated developing safe and convenient tools for timely assessment and monitoring of this condition across a broad range of the population. Three-dimensional (3D) body imaging has become a new means for obesity assessment. Moreover, it generates body shape information that is meaningful for fitness, ergonomics, and personalized clothing. In the previous work of our lab, we developed a prototype active stereo vision system that demonstrated the potential to fulfill this goal. But the prototype required four computer projectors to cast artificial textures on the body to facilitate stereo matching on texture-deficient surfaces (e.g., skin). This decreases the mobility of the system when used to collect data from a large population. In addition, the resolution of the generated 3D images was limited by the cameras and projectors available during the project. The study reported in this dissertation highlights our continued effort to improve the capability of 3D body imaging through simplified hardware for passive stereo and advanced computation techniques.
The system utilizes high-resolution single-lens reflex (SLR) cameras, which have recently become widely available, and is configured in a two-stance design to image the front and back surfaces of a person. A total of eight cameras are used to form four pairs of stereo units. Each unit covers a quarter of the body surface. The stereo units are individually calibrated with a specific pattern to determine the cameras' intrinsic and extrinsic parameters for stereo matching. The global orientation and position of each stereo unit within a common world coordinate system is calculated through a 3D registration step. The stereo calibration and 3D registration procedures do not need to be repeated for a deployed system if the cameras' relative positions have not changed. This property contributes to the portability of the system and greatly alleviates the maintenance task. The image acquisition time is around two seconds for a whole-body capture. The system works in an indoor environment with moderate ambient light.
Advanced stereo computation algorithms are developed by taking advantage of high-resolution images and by tackling the ambiguity problem in stereo matching. A multi-scale, coarse-to-fine matching framework is proposed to match large-scale textures at a low resolution and refine the matched results over higher resolutions. This matching strategy reduces the complexity of the computation and avoids ambiguous matching at the native resolution. The pixel-to-pixel stereo matching algorithm follows a classic, four-step strategy which consists of matching cost computation, cost aggregation, disparity computation and disparity refinement.
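The classic four-step strategy named above can be caricatured in a few lines (an illustrative sketch on synthetic data, not the dissertation's multi-scale algorithm): sum-of-absolute-differences matching cost, box-filter cost aggregation, and winner-take-all disparity selection, with sub-pixel refinement omitted.

```python
import numpy as np

def block_match(left, right, max_disp=6, win=3):
    """Toy pixel-to-pixel stereo: SAD cost, box aggregation, winner-take-all."""
    h, w = left.shape
    cost = np.full((max_disp + 1, h, w), np.inf)
    kernel = np.ones(win) / win
    for d in range(max_disp + 1):
        # 1) matching cost: absolute difference at disparity d
        diff = np.abs(left[:, d:] - right[:, :w - d])
        # 2) cost aggregation: box filter along each scanline
        agg = np.apply_along_axis(
            lambda r: np.convolve(r, kernel, mode="same"), 1, diff)
        cost[d, :, d:] = agg
    # 3) disparity computation: winner-take-all over the cost volume
    return np.argmin(cost, axis=0)

# Synthetic pair: the right image is the left shifted by 3 pixels.
np.random.seed(0)
left = np.random.rand(8, 20)
right = np.zeros_like(left)
right[:, :-3] = left[:, 3:]
disparity = block_match(left, right)   # 3 everywhere a valid match exists
```

A coarse-to-fine version would run this at a low resolution first and use the result to restrict the disparity search range at finer levels, which is what makes the native-resolution matching tractable.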
The system performance has been evaluated on mannequins and human subjects in comparison with other measurement methods. It was found that the geometrical measurements from reconstructed 3D body models, including body circumferences and whole-body volume, are highly repeatable and consistent with manual and other instrumental measurements (CV 0.99). The agreement of percent body fat (%BF) estimation on human subjects between stereo and dual-energy X-ray absorptiometry (DEXA) was found to be improved over the previous active stereo system, and the limits of agreement at 95% confidence were reduced by half. Our achieved %BF estimation agreement is among the lowest reported in comparable studies with commercialized air displacement plethysmography (ADP) and DEXA. In practice, %BF estimation through a two-component model is sensitive to body volume measurement, and the estimation of lung volume could be a source of variation. Protocols for this type of measurement should still be created with an awareness of this factor.
Large-area visually augmented navigation for autonomous underwater vehicles
Submitted to the Joint Program in Applied Ocean Science & Engineering
in partial fulfillment of the requirements for the degree of Doctor of Philosophy
at the Massachusetts Institute of Technology
and the Woods Hole Oceanographic Institution
June 2005.
This thesis describes a vision-based, large-area, simultaneous localization and mapping (SLAM) algorithm that respects the low-overlap imagery constraints typical of autonomous underwater vehicles (AUVs) while exploiting the inertial sensor information that is routinely available on such platforms. We adopt a systems-level approach exploiting the complementary aspects of inertial sensing and visual perception from a calibrated pose-instrumented platform. This systems-level strategy yields a robust solution to underwater imaging that
overcomes many of the unique challenges of a marine environment (e.g., unstructured terrain, low-overlap imagery, moving light source). Our large-area SLAM algorithm recursively incorporates relative-pose constraints using a view-based representation that exploits exact sparsity in the Gaussian canonical form. This sparsity allows for efficient O(n) update complexity in the number of images composing the view-based map by utilizing recent multilevel relaxation techniques. We show that our algorithmic formulation is inherently sparse unlike other feature-based canonical SLAM algorithms, which impose sparseness via pruning approximations. In particular, we investigate
the sparsification methodology employed by sparse extended information filters (SEIFs)
and offer new insight as to why, and how, its approximation can lead to inconsistencies in
the estimated state errors. Lastly, we present a novel algorithm for efficiently extracting consistent marginal covariances, useful for data association, from the information matrix. In summary, this thesis advances the current state of the art in underwater visual navigation by demonstrating end-to-end automatic processing of the largest visually navigated dataset to date, using data collected from a survey of the RMS Titanic (path length over 3 km and 3,100 m² of mapped area). This accomplishment embodies the summed contributions of this thesis to several current SLAM research issues, including scalability, 6-degree-of-freedom motion, unstructured environments, and visual perception.
This work was funded in part by the CenSSIS ERC of the National Science Foundation under grant EEC-9986821, in part by the Woods Hole Oceanographic Institution through a grant from the Penzance Foundation, and in part by an NDSEG Fellowship awarded through the Department of Defense.
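The marginal-covariance extraction mentioned above has a simple mathematical core: the marginal covariance of a state subset is the corresponding block of the inverse of the full information matrix (equivalently, the inverse of a Schur complement), not the inverse of that sub-block alone. A toy sketch with a synthetic information matrix, not the thesis's efficient algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
Lambda = M @ M.T + 6 * np.eye(6)       # synthetic SPD information matrix

idx = [2, 3]                            # states whose marginal we want
rest = [0, 1, 4, 5]

# Correct marginal covariance: block of the full inverse.
Sigma_marg = np.linalg.inv(Lambda)[np.ix_(idx, idx)]

# Equivalent: inverse of the Schur complement over the remaining states.
Laa = Lambda[np.ix_(idx, idx)]
Lab = Lambda[np.ix_(idx, rest)]
Lbb = Lambda[np.ix_(rest, rest)]
schur = Laa - Lab @ np.linalg.inv(Lbb) @ Lab.T

# Wrong in general: inverting the sub-block directly ignores the
# coupling to the marginalized states.
naive = np.linalg.inv(Laa)
```

The Schur-complement identity is what makes targeted marginal recovery possible without inverting the whole information matrix at once.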
Large Area 3-D Reconstructions from Underwater Optical Surveys
Robotic underwater vehicles are regularly performing vast optical surveys of the ocean floor. Scientists value these surveys since optical images offer high levels of detail and are easily interpreted by humans. Unfortunately, the coverage of a single image is limited by absorption and backscatter while what is generally desired is an overall view of the survey area. Recent works on underwater mosaics assume planar scenes and are applicable only to situations without much relief. We present a complete and validated system for processing optical images acquired from an underwater robotic vehicle to form a 3D reconstruction of the ocean floor. Our approach is designed for the most general conditions of wide-baseline imagery (low overlap and presence of significant 3D structure) and scales to hundreds or thousands of images. We only assume a calibrated camera system and a vehicle with uncertain and possibly drifting pose information (e.g., a compass, depth sensor, and a Doppler velocity log). Our approach is based on a combination of techniques from computer vision, photogrammetry, and robotics. We use a local to global approach to structure from motion, aided by the navigation sensors on the vehicle to generate 3D sub-maps. These sub-maps are then placed in a common reference frame that is refined by matching overlapping sub-maps. The final stage of processing is a bundle adjustment that provides the 3D structure, camera poses, and uncertainty estimates in a consistent reference frame. We present results with ground truth for structure as well as results from an oceanographic survey over a coral reef.
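Placing sub-maps in a common reference frame by matching overlapping regions reduces, at its core, to rigid point-set alignment. A minimal Kabsch/Procrustes sketch on toy matched points (not the paper's full registration pipeline):

```python
import numpy as np

def align_submaps(A, B):
    """Least-squares rigid alignment: find R, t minimizing ||R @ A + t - B||,
    where A and B are 3xN arrays of corresponding points."""
    ca = A.mean(axis=1, keepdims=True)
    cb = B.mean(axis=1, keepdims=True)
    H = (A - ca) @ (B - cb).T                       # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Reflection guard: keep det(R) = +1 so R is a proper rotation.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cb - R @ ca
    return R, t

# Toy check: transform random points by a known rotation and translation.
rng = np.random.default_rng(1)
A = rng.random((3, 10))
th = 0.5
R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                   [np.sin(th),  np.cos(th), 0.0],
                   [0.0,         0.0,        1.0]])
t_true = np.array([[1.0], [2.0], [3.0]])
B = R_true @ A + t_true
R, t = align_submaps(A, B)   # recovers R_true, t_true
```

In a real system the correspondences come from feature matching between overlapping sub-maps and the alignment is typically wrapped in a robust estimator, but the closed-form solve above is the kernel.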
A gradient-based approach to fast and accurate head motion compensation in cone-beam CT
Cone-beam computed tomography (CBCT) systems, with their portability, present
a promising avenue for direct point-of-care medical imaging, particularly in
critical scenarios such as acute stroke assessment. However, the integration of
CBCT into clinical workflows faces challenges, primarily linked to long scan
duration resulting in patient motion during scanning and leading to image
quality degradation in the reconstructed volumes. This paper introduces a novel
approach to CBCT motion estimation using a gradient-based optimization
algorithm, which leverages generalized derivatives of the backprojection
operator for cone-beam CT geometries. Building on that, a fully differentiable
target function is formulated which grades the quality of the current motion
estimate in reconstruction space. We drastically accelerate motion estimation
yielding a 19-fold speed-up compared to existing methods. Additionally, we
investigate the architecture of networks used for quality metric regression and
propose predicting voxel-wise quality maps, favoring autoencoder-like
architectures over contracting ones. This modification improves gradient flow,
leading to more accurate motion estimation. The presented method is evaluated
through realistic experiments on head anatomy. It achieves a reduction in
reprojection error from an initial average of 3 mm to 0.61 mm after motion
compensation and consistently demonstrates superior performance compared to
existing approaches. The analytic Jacobian for the backprojection operation,
which is at the core of the proposed method, is made publicly available. In
summary, this paper contributes to the advancement of CBCT integration into
clinical workflows by proposing a robust motion estimation approach that
enhances efficiency and accuracy, addressing critical challenges in
time-sensitive scenarios.
Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
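The gradient-based estimation loop can be caricatured in one dimension (a toy sketch with made-up numbers, not the paper's differentiable backprojection): a motion parameter is recovered by descending a differentiable target that grades the current estimate in reconstruction space.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 201)
reference = np.exp(-x**2 / 0.02)          # stand-in for a motion-free reconstruction

def target(t):
    # Quality of the motion estimate t: squared mismatch in "reconstruction space".
    moved = np.exp(-(x - t)**2 / 0.02)    # signal under the current estimate
    return np.sum((moved - reference) ** 2)

def grad(t, eps=1e-5):
    # Central finite difference stands in for the analytic Jacobian
    # of the backprojection operator used in the paper.
    return (target(t + eps) - target(t - eps)) / (2 * eps)

t = 0.08                                  # initial (wrong) motion estimate
for _ in range(100):
    t -= 5e-4 * grad(t)                   # gradient descent step
# t converges toward the true shift of 0
```

The paper's contribution is making this gradient analytic and fast for full cone-beam geometries; the descent loop itself is this simple.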