Forest structure from terrestrial laser scanning – in support of remote sensing calibration/validation and operational inventory
Forests are an important part of the natural ecosystem, providing resources such as timber and fuel, performing services such as energy exchange and carbon storage, and presenting risks, such as fire damage and invasive species impacts. Improved characterization of forest structural attributes is desirable, as it could improve our understanding and management of these natural resources.
However, the traditional, systematic collection of forest information – dubbed “forest inventory” – is time-consuming, expensive, and coarse when compared to novel 3-D measurement technologies. Remote sensing estimates, on the other hand, provide synoptic coverage, but often fail to capture the fine-scale structural variation of the forest environment. Terrestrial laser scanning (TLS) has demonstrated the potential to address these limitations, but its operational use has remained limited by an unsatisfactory trade-off between performance and the budgetary constraints of many end-users.
To address this gap, this dissertation advanced affordable mobile laser scanning capabilities for operational forest structure assessment. We developed geometric reconstruction of forest structure from rapid-scan, low-resolution point cloud data, providing for automatic extraction of standard forest inventory metrics. To extend these results over larger areas, we designed a view-invariant feature descriptor that enables marker-free registration of TLS data pairs without knowledge of the initial sensor pose. Finally, we integrated a graph-theory framework to perform multi-view registration across a network of disconnected scans, which improved the assessment of forest inventory variables.
This work addresses a major limitation of TLS – its inability to assess forest structure at an operational scale – and may facilitate improved understanding of the phenomenology of airborne sensing systems by providing fine-scale reference data with which to interpret active or passive electromagnetic radiation interactions with forest structure. Outputs are being used to provide antecedent science data for NASA’s HyspIRI mission and to support the National Ecological Observatory Network’s (NEON) long-term environmental monitoring initiatives.
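The pairwise and multi-view registration steps described above ultimately reduce to estimating a rigid transform from matched points. As a rough illustration (not the dissertation's actual pipeline), the least-squares rigid alignment of two matched point sets can be computed in closed form with the Kabsch/Procrustes solution; the point sets here are synthetic:

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rigid transform (R, t) with dst ≈ R @ src + t (Kabsch)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)               # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t

# toy check: recover a known 30-degree yaw and a translation
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 0.5])
src = np.random.default_rng(0).normal(size=(50, 3))
dst = src @ R_true.T + t_true
R, t = rigid_align(src, dst)
assert np.allclose(R, R_true) and np.allclose(t, t_true)
```

The SVD-based solution is exact for noise-free correspondences and least-squares optimal for noisy ones, which is why it is the standard closing step once matches are available.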
GraffMatch: Global Matching of 3D Lines and Planes for Wide Baseline LiDAR Registration
Using geometric landmarks like lines and planes can increase navigation
accuracy and decrease map storage requirements compared to commonly-used LiDAR
point cloud maps. However, landmark-based registration for applications like
loop closure detection is challenging because a reliable initial guess is not
available. Global landmark matching has been investigated in the literature,
but these methods typically use ad hoc representations of 3D line and plane
landmarks that are not invariant to large viewpoint changes, resulting in
incorrect matches and high registration error. To address this issue, we adopt
the affine Grassmannian manifold to represent 3D lines and planes and prove
that the distance between two landmarks is invariant to rotation and
translation if a shift operation is performed before applying the Grassmannian
metric. This invariance property enables the use of our graph-based data
association framework for identifying landmark matches that can subsequently be
used for registration in the least-squares sense. Evaluated on a challenging
landmark matching and registration task using publicly-available LiDAR
datasets, our approach yields a 1.7x and 3.5x improvement in successful
registrations compared to methods that use viewpoint-dependent centroid and
"closest point" representations, respectively.Comment: accepted to RA-L; 8 pages. arXiv admin note: text overlap with
arXiv:2205.0855
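The invariance claim can be made concrete in the simpler linear-subspace setting. The sketch below computes the classical Grassmannian geodesic distance via principal angles and checks that rotating both landmarks by the same rotation leaves it unchanged; note this toy treats subspaces through the origin, whereas the paper's contribution is the affine case handled via its shift operation:

```python
import numpy as np

def grassmann_dist(A, B):
    """Geodesic Grassmannian distance between the subspaces spanned by
    the columns of A and B, computed from the principal angles."""
    Qa, _ = np.linalg.qr(A)                      # orthonormal bases
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    theta = np.arccos(np.clip(s, -1.0, 1.0))     # principal angles
    return np.linalg.norm(theta)

# a plane landmark (xy-plane) and a line landmark (z-axis) in R^3
plane = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
line = np.array([[0.0], [0.0], [1.0]])

# applying the same rotation to both landmarks leaves the distance
# unchanged, the invariance the graph-based matching relies on
R, _ = np.linalg.qr(np.random.default_rng(1).normal(size=(3, 3)))
assert np.isclose(grassmann_dist(plane, line),
                  grassmann_dist(R @ plane, R @ line))
```

For this orthogonal pair the single principal angle is 90 degrees, and it stays 90 degrees under any common rotation of the scene.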
Automatic Reconstruction of Textured 3D Models
Three-dimensional modeling and visualization of environments is an increasingly important problem. This work addresses the problem of automatic 3D reconstruction, and we present a system for unsupervised reconstruction of textured 3D models in the context of modeling indoor environments. We present solutions to all aspects of the modeling process and an integrated system for the automatic creation of large-scale 3D models.
Visual Perception For Robotic Spatial Understanding
Humans understand the world through vision without much effort. We perceive the structure, objects, and people in the environment and pay little direct attention to most of it, until it becomes useful. Intelligent systems, especially mobile robots, have no such biologically engineered vision mechanism to take for granted. In contrast, we must devise algorithmic methods of taking raw sensor data and converting it to something useful very quickly. Vision is such a necessary part of building a robot or any intelligent system that is meant to interact with the world that it is somewhat surprising we don't have off-the-shelf libraries for this capability.
Why is this? The simple answer is that the problem is extremely difficult. There has been progress, but the current state of the art is impressive and depressing at the same time. We now have neural networks that can recognize many objects in 2D images, in some cases performing better than a human. Some algorithms can also provide bounding boxes or pixel-level masks to localize the object. We have visual odometry and mapping algorithms that can build reasonably detailed maps over long distances with the right hardware and conditions. On the other hand, we have robots with many sensors and no efficient way to compute their relative extrinsic poses for integrating the data in a single frame. The same networks that produce good object segmentations and labels in a controlled benchmark still miss obvious objects in the real world and have no mechanism for learning on the fly while the robot is exploring. Finally, while we can detect pose for very specific objects, we don't yet have a mechanism that detects pose that generalizes well over categories or that can describe new objects efficiently.
We contribute algorithms in four of the areas mentioned above. First, we describe a practical and effective system for calibrating many sensors on a robot spanning up to three different modalities. Second, we present our approach to visual odometry and mapping that exploits the unique capabilities of RGB-D sensors to efficiently build detailed representations of an environment. Third, we describe a 3-D over-segmentation technique that utilizes the model and ego-motion output of the previous step to generate segmentations that are temporally consistent under camera motion. Finally, we develop a synthesized dataset of chair objects with part labels and investigate the influence of parts on RGB-D based object pose recognition using a novel network architecture we call PartNet.
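A small aside on the RGB-D machinery underlying the mapping step: depth images are typically lifted to 3-D point clouds with a pinhole back-projection before any odometry or segmentation. The sketch below is generic, with made-up intrinsics, and is not code from the dissertation:

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters) to 3-D points in the camera
    frame via the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)     # shape (h, w, 3)

# toy 2x2 depth image with illustrative intrinsics (not from the thesis)
depth = np.full((2, 2), 2.0)
pts = depth_to_points(depth, fx=500.0, fy=500.0, cx=0.5, cy=0.5)
# the pixel at (u=0, v=0) back-projects to x = (0 - 0.5) * 2 / 500 = -0.002
assert np.isclose(pts[0, 0, 0], -0.002)
```

Once each frame is a point cloud in its own camera frame, the estimated ego-motion chains the frames into a single map representation.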
IMU-based Online Multi-lidar Calibration
Modern autonomous systems typically use several sensors for perception. For
best performance, accurate and reliable extrinsic calibration is necessary. In
this research, we propose a reliable technique for the extrinsic calibration of
several lidars on a vehicle without the need for odometry estimation or
fiducial markers. First, our method generates an initial guess of the
extrinsics by matching the raw signals of IMUs co-located with each lidar. This
initial guess is then used in ICP and point cloud feature matching which
refines and verifies this estimate. Furthermore, we can use observability
criteria to choose a subset of the IMU measurements that have the highest
mutual information -- rather than comparing all the readings. We have
successfully validated our methodology using data gathered from Scania test
vehicles.

Comment: For associated video, see https://youtu.be/HJ0CBWTFOh
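The idea of matching raw IMU signals for the initial guess can be sketched as an orthogonal Procrustes problem on time-aligned angular velocities: two rigidly mounted gyros observe the same angular velocity expressed in their own frames, so omega_a ≈ R omega_b. The function below is an illustrative reconstruction that ignores noise, bias, and time synchronization, and is not the authors' implementation:

```python
import numpy as np

def rotation_from_gyros(omega_a, omega_b):
    """Extrinsic rotation guess R with omega_a[i] ≈ R @ omega_b[i], from
    (N, 3) arrays of time-aligned angular velocities of two rigidly
    mounted IMUs (orthogonal Procrustes on the raw gyro signals)."""
    H = omega_b.T @ omega_a                     # 3x3 signal correlation
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # keep a proper rotation
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

# toy check: two gyros related by a known 90-degree yaw
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
omega_b = np.random.default_rng(2).normal(size=(200, 3))
omega_a = omega_b @ R_true.T
assert np.allclose(rotation_from_gyros(omega_a, omega_b), R_true)
```

In the paper's pipeline an estimate like this would only seed ICP and feature matching, which then refine and verify the extrinsics.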
Line Based Multi-Range Asymmetric Conditional Random Field For Terrestrial Laser Scanning Data Classification
Terrestrial Laser Scanning (TLS) is a ground-based, active imaging method that rapidly acquires accurate, highly dense three-dimensional point clouds of object surfaces by laser range finding. To fully utilize its benefits, a robust method to classify many objects of interest from huge amounts of laser point clouds is urgently required. However, classifying massive TLS data faces many challenges, such as complex urban scenes and partial data acquisition due to occlusion. To make TLS data classification automatic, accurate, and robust, we present a line-based multi-range asymmetric Conditional Random Field algorithm.
The first contribution is to propose a line-based TLS data classification method. In this thesis, we are interested in seven classes: building, roof, pedestrian road (PR), tree, low man-made object (LMO), vehicle road (VR), and low vegetation (LV). The line-based classification is implemented in each scan profile, which follows the line-profiling nature of the laser scanning mechanism. Ten conventional local classifiers are tested, including popular generative and discriminative classifiers, and experimental results validate that the line-based method can achieve satisfactory classification performance. However, local classifiers label each line independently of its neighborhood, so their inference often suffers from similar local appearance across different object classes.

The second contribution is to propose a multi-range asymmetric Conditional Random Field (maCRF) model, which uses object context as a post-classification step to improve the performance of a local generative classifier. The maCRF incorporates appearance, a local smoothness constraint, and global scene-layout regularity together in a probabilistic graphical model. The local smoothness encourages lines in a local area to have the same class label, while the scene layout favours an asymmetric regularity of spatial arrangement between different object classes over long ranges, considered in both the vertical (above-below) and horizontal (front-behind) directions. The asymmetric regularity captures the directional spatial arrangement between pairs of objects (e.g. the ground may be lower than a building, but not vice versa).

The third contribution is to extend the maCRF model by adding across-scan-profile context, yielding the Across-scan-profile Multi-range Asymmetric Conditional Random Field (amaCRF) model.
Due to the sweeping nature of laser scanning, sequentially acquired TLS data have strong spatial dependency, and the across-scan-profile context can provide more contextual information. The final contribution is to propose a sequential classification strategy: along the sweeping direction of the laser scanning, amaCRF models are constructed sequentially. By dynamically updating the posterior probabilities of shared scan profiles, contextual information propagates through adjacent scan profiles.
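The effect of the local smoothness term can be illustrated with a toy pairwise model over a 1-D sequence of lines: a unary cost from a local classifier plus a Potts penalty when neighboring lines disagree, minimized here with simple iterated conditional modes (ICM). This omits the multi-range, asymmetric, and across-profile terms that are the thesis's actual contributions:

```python
import numpy as np

def icm(unary, w, n_iters=10):
    """Iterated conditional modes on a chain CRF: minimize
    sum_i unary[i, y_i] + w * sum_i [y_i != y_{i+1}]  (Potts smoothness)."""
    y = unary.argmin(axis=1)                    # start from local classifier
    n, n_labels = unary.shape
    for _ in range(n_iters):
        for i in range(n):
            costs = unary[i].copy()
            for lbl in range(n_labels):
                if i > 0 and lbl != y[i - 1]:
                    costs[lbl] += w             # disagree with left neighbor
                if i < n - 1 and lbl != y[i + 1]:
                    costs[lbl] += w             # disagree with right neighbor
            y[i] = costs.argmin()
    return y

# lines 0-4 favor class 0 and lines 5-9 favor class 1, except line 2,
# where weak local evidence points the wrong way
unary = np.array([[0.0, 1.0]] * 5 + [[1.0, 0.0]] * 5)
unary[2] = [0.6, 0.4]
assert icm(unary, w=0.0)[2] == 1                # local-only: mislabeled
assert list(icm(unary, w=0.5)) == [0] * 5 + [1] * 5   # context fixes it
```

The same intuition scales up: context terms overrule weak, ambiguous local appearance, which is exactly where independent per-line classifiers fail.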
Development of a probabilistic perception system for camera-lidar sensor fusion
Multi-modal depth estimation is one of the key challenges for endowing autonomous
machines with robust robotic perception capabilities. There has been an outstanding
advance in the development of uni-modal depth estimation techniques based
on either monocular cameras, because of their high resolution, or LiDAR sensors, due
to the precise geometric data they provide. However, each of them suffers from
inherent drawbacks like high sensitivity to changes in illumination conditions in
the case of cameras and limited resolution for the LiDARs. Sensor fusion can be
used to combine the merits and compensate for the downsides of these two kinds of
sensors. Nevertheless, current fusion methods work at a high level. They process
sensor data streams independently and combine the high level estimates obtained
for each sensor. In this thesis, I tackle the problem at a low level, fusing the raw
sensor streams, thus obtaining depth estimates which are both dense and precise,
and can be used as a unified multi-modal data source for higher level estimation
problems.
This work proposes a Conditional Random Field (CRF) model with multiple geometry
and appearance potentials that seamlessly represents the problem of estimating
dense depth maps from camera and LiDAR data. The model can be optimized
efficiently using the Conjugate Gradient Squared (CGS) algorithm. The proposed
method was evaluated and compared with the state-of-the-art using the commonly
used KITTI benchmark dataset. In addition, the model is qualitatively evaluated using
data acquired by the author of this work.

Magíster en Ingeniería de Desarrollo de Producto (Master's in Product Development Engineering)
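The low-level fusion idea (sparse but accurate LiDAR samples densified under a smoothness prior) reduces to a linear system that can be solved with the same CGS algorithm the thesis names. The 1-D toy below is illustrative only; the sizes, weights, and chain-graph prior are assumptions, not the thesis's actual CRF:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cgs

# sparse, accurate "LiDAR" depth samples along a 1-D profile
n, lam = 100, 10.0
z, mask = np.zeros(n), np.zeros(n)
for i, depth in [(10, 1.0), (50, 2.0), (90, 1.5)]:
    z[i], mask[i] = depth, 1.0

# chain-graph Laplacian encoding the pairwise smoothness potential
L = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)).tolil()
L[0, 0] = L[n - 1, n - 1] = 1.0                 # free boundaries
A = lam * diags(mask) + L.tocsr()               # normal equations of the energy
d, info = cgs(A, lam * mask * z)                # dense depth profile via CGS
assert info == 0
assert abs(d[10] - 1.0) < 0.05 and abs(d[50] - 2.0) < 0.05
```

The recovered profile interpolates smoothly between the three measurements while staying pinned near them, which is the qualitative behavior the full 2-D camera-LiDAR model generalizes with its additional appearance potentials.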