490 research outputs found
Map-Based Localization for Unmanned Aerial Vehicle Navigation
Unmanned Aerial Vehicles (UAVs) require precise pose estimation when navigating in indoor and GNSS-denied / GNSS-degraded outdoor environments. The possibility of crashing in these environments is high, as spaces are confined, with many moving obstacles. There are many solutions for localization in GNSS-denied environments, and many different technologies are used. Common solutions involve setting up or using existing infrastructure, such as beacons, Wi-Fi, or surveyed targets. These solutions were avoided because the cost should be proportional to the number of users, not the coverage area. Heavy and expensive sensors, for example a high-end IMU, were also avoided. Given these requirements, a camera-based localization solution was selected for the sensor pose estimation. Several camera-based localization approaches were investigated. Map-based localization methods were shown to be the most efficient because they close loops using a pre-existing map, thus the amount of data and the amount of time spent collecting data are reduced as there is no need to re-observe the same areas multiple times. This dissertation proposes a solution to address the task of fully localizing a monocular camera onboard a UAV with respect to a known environment (i.e., it is assumed that a 3D model of the environment is available) for the purpose of navigation for UAVs in structured environments.
Incremental map-based localization involves tracking a map through an image sequence. When the map is a 3D model, this task is referred to as model-based tracking. A by-product of the tracker is the relative 3D pose (position and orientation) between the camera and the object being tracked. State-of-the-art solutions advocate that tracking geometry is more robust than tracking image texture because edges are more invariant to changes in object appearance and lighting. However, model-based trackers have been limited to tracking small simple objects in small environments. An assessment was performed in tracking larger, more complex building models, in larger environments. A state-of-the art model-based tracker called ViSP (Visual Servoing Platform) was applied in tracking outdoor and indoor buildings using a UAVs low-cost camera. The assessment revealed weaknesses at large scales. Specifically, ViSP failed when tracking was lost, and needed to be manually re-initialized. Failure occurred when there was a lack of model features in the cameras field of view, and because of rapid camera motion. Experiments revealed that ViSP achieved positional accuracies similar to single point positioning solutions obtained from single-frequency (L1) GPS observations standard deviations around 10 metres. These errors were considered to be large, considering the geometric accuracy of the 3D model used in the experiments was 10 to 40 cm. The first contribution of this dissertation proposes to increase the performance of the localization system by combining ViSP with map-building incremental localization, also referred to as simultaneous localization and mapping (SLAM). Experimental results in both indoor and outdoor environments show sub-metre positional accuracies were achieved, while reducing the number of tracking losses throughout the image sequence. It is shown that by integrating model-based tracking with SLAM, not only does SLAM improve model tracking performance, but the model-based tracker alleviates the computational expense of SLAMs loop closing procedure to improve runtime performance. Experiments also revealed that ViSP was unable to handle occlusions when a complete 3D building model was used, resulting in large errors in its pose estimates. The second contribution of this dissertation is a novel map-based incremental localization algorithm that improves tracking performance, and increases pose estimation accuracies from ViSP. The novelty of this algorithm is the implementation of an efficient matching process that identifies corresponding linear features from the UAVs RGB image data and a large, complex, and untextured 3D model. The proposed model-based tracker improved positional accuracies from 10 m (obtained with ViSP) to 46 cm in outdoor environments, and improved from an unattainable result using VISP to 2 cm positional accuracies in large indoor environments.
The main disadvantage of any incremental algorithm is that it requires the camera pose of the first frame. Initialization is often a manual process. The third contribution of this dissertation is a map-based absolute localization algorithm that automatically estimates the camera pose when no prior pose information is available. The method benefits from vertical line matching to accomplish a registration procedure of the reference model views with a set of initial input images via geometric hashing. Results demonstrate that sub-metre positional accuracies were achieved and a proposed enhancement of conventional geometric hashing produced more correct matches - 75% of the correct matches were identified, compared to 11%. Further the number of incorrect matches was reduced by 80%
Learning to Transform Time Series with a Few Examples
We describe a semi-supervised regression algorithm that learns to transform one time series into another time series given examples of the transformation. This algorithm is applied to tracking, where a time series of observations from sensors is transformed to a time series describing the pose of a target. Instead of defining and implementing such transformations for each tracking task separately, our algorithm learns a memoryless transformation of time series from a few example input-output mappings. The algorithm searches for a smooth function that fits the training examples and, when applied to the input time series, produces a time series that evolves according to assumed dynamics. The learning procedure is fast and lends itself to a closed-form solution. It is closely related to nonlinear system identification and manifold learning techniques. We demonstrate our algorithm on the tasks of tracking RFID tags from signal strength measurements, recovering the pose of rigid objects, deformable bodies, and articulated bodies from video sequences. For these tasks, this algorithm requires significantly fewer examples compared to fully-supervised regression algorithms or semi-supervised learning algorithms that do not take the dynamics of the output time series into account
Kernelized Locality-Sensitive Hashing for Fast Image Landmark Association
As the concept of war has evolved, navigation in urban environments where GPS may be degraded is increasingly becoming more important. Two existing solutions are vision-aided navigation and vision-based Simultaneous Localization and Mapping (SLAM). The problem, however, is that vision-based navigation techniques can require excessive amounts of memory and increased computational complexity resulting in a decrease in speed. This research focuses on techniques to improve such issues by speeding up and optimizing the data association process in vision-based SLAM. Specifically, this work studies the current methods that algorithms use to associate a current robot pose to that of one previously seen and introduce another method to the image mapping arena for comparison. The current method, kd-trees, is effcient in lower dimensions, but does not narrow the search space enough in higher dimensional datasets. In this research, Kernelized Locality-Sensitive Hashing (KLSH) is implemented to conduct the aforementioned pose associations. Results on KLSH shows that fewer image comparisons are required for location identification than that of other methods. This work can then be extended into a vision-SLAM implementation to subsequently produce a map
Vision-based retargeting for endoscopic navigation
Endoscopy is a standard procedure for visualising the human gastrointestinal tract. With the advances in biophotonics, imaging techniques such as narrow band imaging, confocal laser endomicroscopy, and optical coherence tomography can be combined with normal endoscopy for assisting the early diagnosis of diseases, such as cancer. In the past decade, optical biopsy has emerged to be an effective tool for tissue analysis, allowing in vivo and in situ assessment of pathological sites with real-time feature-enhanced microscopic images. However, the non-invasive nature of optical biopsy leads to an intra-examination retargeting problem, which is associated with the difficulty of re-localising a biopsied site consistently throughout the whole examination. In addition to intra-examination retargeting, retargeting of a pathological site is even more challenging across examinations, due to tissue deformation and changing tissue morphologies and appearances. The purpose of this thesis is to address both the intra- and inter-examination retargeting problems associated with optical biopsy. We propose a novel vision-based framework for intra-examination retargeting. The proposed framework is based on combining visual tracking and detection with online learning of the appearance of the biopsied site. Furthermore, a novel cascaded detection approach based on random forests and structured support vector machines is developed to achieve efficient retargeting. To cater for reliable inter-examination retargeting, the solution provided in this thesis is achieved by solving an image retrieval problem, for which an online scene association approach is proposed to summarise an endoscopic video collected in the first examination into distinctive scenes. A hashing-based approach is then used to learn the intrinsic representations of these scenes, such that retargeting can be achieved in subsequent examinations by retrieving the relevant images using the learnt representations. For performance evaluation of the proposed frameworks, extensive phantom, ex vivo and in vivo experiments have been conducted, with results demonstrating the robustness and potential clinical values of the methods proposed.Open Acces
Efficient Configuration Space Construction and Optimization for Motion Planning
The configuration space is a fundamental concept that is widely used in algorithmic robotics. Many applications in robotics, computer-aided design, and related areas can be reduced to computational problems in terms of configuration spaces. In this paper, we survey some of our recent work on solving two important challenges related to configuration spaces
Nonrigid reconstruction of 3D breast surfaces with a low-cost RGBD camera for surgical planning and aesthetic evaluation
Accounting for 26% of all new cancer cases worldwide, breast cancer remains
the most common form of cancer in women. Although early breast cancer has a
favourable long-term prognosis, roughly a third of patients suffer from a
suboptimal aesthetic outcome despite breast conserving cancer treatment.
Clinical-quality 3D modelling of the breast surface therefore assumes an
increasingly important role in advancing treatment planning, prediction and
evaluation of breast cosmesis. Yet, existing 3D torso scanners are expensive
and either infrastructure-heavy or subject to motion artefacts. In this paper
we employ a single consumer-grade RGBD camera with an ICP-based registration
approach to jointly align all points from a sequence of depth images
non-rigidly. Subtle body deformation due to postural sway and respiration is
successfully mitigated leading to a higher geometric accuracy through
regularised locally affine transformations. We present results from 6 clinical
cases where our method compares well with the gold standard and outperforms a
previous approach. We show that our method produces better reconstructions
qualitatively by visual assessment and quantitatively by consistently obtaining
lower landmark error scores and yielding more accurate breast volume estimates
A Survey on Graph Kernels
Graph kernels have become an established and widely-used technique for
solving classification tasks on graphs. This survey gives a comprehensive
overview of techniques for kernel-based graph classification developed in the
past 15 years. We describe and categorize graph kernels based on properties
inherent to their design, such as the nature of their extracted graph features,
their method of computation and their applicability to problems in practice. In
an extensive experimental evaluation, we study the classification accuracy of a
large suite of graph kernels on established benchmarks as well as new datasets.
We compare the performance of popular kernels with several baseline methods and
study the effect of applying a Gaussian RBF kernel to the metric induced by a
graph kernel. In doing so, we find that simple baselines become competitive
after this transformation on some datasets. Moreover, we study the extent to
which existing graph kernels agree in their predictions (and prediction errors)
and obtain a data-driven categorization of kernels as result. Finally, based on
our experimental results, we derive a practitioner's guide to kernel-based
graph classification
DEANN: Speeding up Kernel-Density Estimation using Approximate Nearest Neighbor Search
Kernel Density Estimation (KDE) is a nonparametric method for estimating the
shape of a density function, given a set of samples from the distribution.
Recently, locality-sensitive hashing, originally proposed as a tool for nearest
neighbor search, has been shown to enable fast KDE data structures. However,
these approaches do not take advantage of the many other advances that have
been made in algorithms for nearest neighbor algorithms. We present an
algorithm called Density Estimation from Approximate Nearest Neighbors (DEANN)
where we apply Approximate Nearest Neighbor (ANN) algorithms as a black box
subroutine to compute an unbiased KDE. The idea is to find points that have a
large contribution to the KDE using ANN, compute their contribution exactly,
and approximate the remainder with Random Sampling (RS). We present a
theoretical argument that supports the idea that an ANN subroutine can speed up
the evaluation. Furthermore, we provide a C++ implementation with a Python
interface that can make use of an arbitrary ANN implementation as a subroutine
for KDE evaluation. We show empirically that our implementation outperforms
state of the art implementations in all high dimensional datasets we
considered, and matches the performance of RS in cases where the ANN yield no
gains in performance.Comment: 24 pages, 1 figure. Submitted for revie
- …