Lidar-based Obstacle Detection and Recognition for Autonomous Agricultural Vehicles
Today, agricultural vehicles are available that can drive autonomously and follow exact route plans more precisely than human operators. Combined with advancements in precision agriculture, autonomous agricultural robots can reduce manual labor, improve workflow, and optimize yield. However, as of today, human operators are still required for monitoring the environment and acting upon potential obstacles in front of the vehicle. To eliminate this need, safety must be ensured by accurate and reliable obstacle detection and avoidance systems.

In this thesis, lidar-based obstacle detection and recognition in agricultural environments has been investigated. A rotating multi-beam lidar generating 3D point clouds was used for point-wise classification of agricultural scenes, while multi-modal fusion with cameras and radar was used to increase performance and robustness. Two research perception platforms were presented and used for data acquisition. The proposed methods were all evaluated on recorded datasets that represented a wide range of realistic agricultural environments and included both static and dynamic obstacles.

For 3D point cloud classification, two methods were proposed for handling density variations during feature extraction. One method outperformed a frequently used generic 3D feature descriptor, whereas the other method showed promising preliminary results using deep learning on 2D range images. For multi-modal fusion, four methods were proposed for combining lidar with color camera, thermal camera, and radar. Gradual improvements in classification accuracy were seen as spatial, temporal, and multi-modal relationships were introduced in the models. Finally, occupancy grid mapping was used to fuse and map detections globally, and runtime obstacle detection was applied on mapped detections along the vehicle path, thus simulating an actual traversal.

The proposed methods serve as a first step towards full autonomy for agricultural vehicles.
The study has thus shown that recent advancements in autonomous driving can be transferred to the agricultural domain, provided that accurate distinctions are made between obstacles and processable vegetation. Future research in the domain is further facilitated by the release of the multi-modal obstacle dataset, FieldSAFE.
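The 2D range images mentioned for deep learning on lidar point clouds are typically produced by a spherical projection of the points. A minimal sketch of such a projection; the image size and vertical field of view below are illustrative assumptions, not the thesis's sensor setup:

```python
import numpy as np

def project_to_range_image(points, h=64, w=1024, fov_up=15.0, fov_down=-15.0):
    """Project 3D lidar points (N, 3) onto a 2D range image via spherical coordinates."""
    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    fov = fov_up_r - fov_down_r
    r = np.linalg.norm(points, axis=1)            # range per point
    yaw = np.arctan2(points[:, 1], points[:, 0])  # azimuth in [-pi, pi]
    pitch = np.arcsin(points[:, 2] / r)           # elevation angle
    u = ((1.0 - (yaw + np.pi) / (2 * np.pi)) * w).astype(int) % w
    v = ((fov_up_r - pitch) / fov * h).clip(0, h - 1).astype(int)
    img = np.zeros((h, w), dtype=np.float32)
    img[v, u] = r                                 # last point wins per pixel
    return img
```

The resulting image can then be fed to an ordinary 2D CNN, which is the appeal of the range-image representation.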
Informal settlement segmentation using VHR RGB and height information from UAV imagery: a case study of Nepal
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies.

Informal settlements in developing countries are complex. They are contextually and radiometrically very similar to formal settlements, and the resolution offered by satellite remote sensing is not sufficient to capture the high variation and small feature sizes of informal settlements in these situations. UAV imagery offers a solution with higher resolution. Incorporating a UAV image together with a normalized DSM obtained from the UAV provides an opportunity to include 3D information, which can be a crucial factor for informal settlement extraction in countries like Nepal: while formal and informal settlements have similar texture, they differ significantly in height. We therefore propose segmenting informal settlements in Nepal using UAV imagery and a normalized DSM, against the traditional approach of using the orthophoto only or the orthophoto and DSM. Absolute height, the normalized DSM (nDSM), and a vegetation index from the visual bands, added to the 8-bit RGB channels, are used to locate informal settlements. Including the nDSM in the segmentation increased Intersection over Union (IoU) for informal settlements by 6%. An IoU of 85% for informal settlements is obtained with the nDSM using a ResNet18-based U-Net trained end to end. Using a threshold value had the same effect as using absolute height, meaning that thresholding does not alter the result obtained with the absolute nDSM. Integrating height as an additional band performed better than a model that processed height separately. Interestingly, the benefit of the vegetation index is limited to settlements with small huts partly covered with vegetation, and it has no or a negative effect elsewhere.
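The input stacking described above can be sketched as follows. The Excess Green index is used here as a stand-in visible-band vegetation index, since the exact index is not named (an assumption), and the channel layout is illustrative:

```python
import numpy as np

def stack_inputs(rgb, ndsm):
    """Stack RGB, nDSM, and a visible-band vegetation index into one input tensor.

    rgb:  (H, W, 3) uint8 orthophoto
    ndsm: (H, W)    float32 height above ground in metres
    """
    rgbf = rgb.astype(np.float32) / 255.0
    r, g, b = rgbf[..., 0], rgbf[..., 1], rgbf[..., 2]
    exg = 2.0 * g - r - b                  # Excess Green index (assumed choice)
    # Final tensor: R, G, B, nDSM, vegetation index -> (H, W, 5)
    return np.dstack([rgbf, ndsm[..., None], exg[..., None]])
```

A 5-channel tensor like this only requires changing the first convolution of the ResNet18-based U-Net to accept five input channels.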
Elevation Estimation-Driven Building 3D Reconstruction from Single-View Remote Sensing Imagery
Building 3D reconstruction from remote sensing images has a wide range of applications in smart cities, photogrammetry, and other fields. Methods for automatic 3D urban building modeling typically employ multi-view images as input to recover point clouds and 3D models of buildings. However, such models rely heavily on multi-view images of buildings, which are time-consuming to acquire and limit the applicability and practicality of the models. To solve these issues, we focus on designing an efficient DSM estimation-driven reconstruction framework (Building3D), which aims to reconstruct 3D building models from a single-view remote sensing image. First, we propose a Semantic Flow Field-guided DSM Estimation (SFFDE) network, which utilizes the proposed concept of elevation semantic flow to achieve the registration of local and global features. Specifically, in order to make the network semantics globally aware, we propose an Elevation Semantic Globalization (ESG) module to realize the semantic globalization of instances. Further, in order to bridge the semantic gap between global features and the original local features, we propose a Local-to-Global Elevation Semantic Registration (L2G-ESR) module based on elevation semantic flow. Our Building3D is rooted in the SFFDE network for building elevation prediction, synchronized with a building extraction network for building masks, and then sequentially performs point cloud reconstruction and surface reconstruction (or CityGML model reconstruction). On this basis, Building3D can optionally generate CityGML models or surface mesh models of the buildings. Extensive experiments on the ISPRS Vaihingen and DFC2019 datasets for the DSM estimation task show that our SFFDE significantly improves upon the state of the art. Furthermore, Building3D achieves impressive results in 3D point cloud and 3D model reconstruction.
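In a DSM estimation-driven pipeline like this one, the point cloud reconstruction step amounts to back-projecting masked building pixels of the predicted elevation raster into 3D. A minimal sketch under an assumed ground sampling distance and coordinate convention (not the paper's code):

```python
import numpy as np

def dsm_to_point_cloud(dsm, mask, gsd=0.5):
    """Back-project a predicted DSM into a 3D point cloud for masked buildings.

    dsm:  (H, W) predicted elevation in metres (network output)
    mask: (H, W) boolean building mask (from the extraction network)
    gsd:  ground sampling distance in metres per pixel (assumed value)
    """
    v, u = np.nonzero(mask)        # pixel rows/cols inside buildings
    x = u * gsd                    # easting (axis convention assumed)
    y = v * gsd                    # northing
    z = dsm[v, u]                  # predicted height per pixel
    return np.column_stack([x, y, z])   # (N, 3) point cloud
```

Surface reconstruction (e.g. a mesh or a CityGML LoD model) would then run on this point cloud.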
TractorEYE: Vision-based Real-time Detection for Autonomous Vehicles in Agriculture
Agricultural vehicles such as tractors and harvesters have for decades been able to navigate automatically and more efficiently using commercially available products such as auto-steering and tractor-guidance systems. However, a human operator is still required inside the vehicle to ensure the safety of the vehicle and especially of the surroundings, such as humans and animals. To get fully autonomous vehicles certified for farming, computer vision algorithms and sensor technologies must detect obstacles with performance equivalent to or better than human level. Furthermore, detections must run in real time to allow vehicles to actuate and avoid collision.

This thesis proposes a detection system (TractorEYE), a dataset (FieldSAFE), and procedures to fuse information from multiple sensor technologies to improve detection of obstacles and to generate a map. TractorEYE is a multi-sensor detection system for autonomous vehicles in agriculture. The multi-sensor system consists of three hardware-synchronized and registered sensors (stereo camera, thermal camera, and multi-beam lidar) mounted on/in a ruggedized and water-resistant casing. Algorithms have been developed to run a total of six detection algorithms (four for the RGB camera, one for the thermal camera, and one for the multi-beam lidar) and to fuse detection information in a common format using either 3D positions or Inverse Sensor Models. A GPU-powered computational platform is able to run the detection algorithms online. For the RGB camera, a deep learning algorithm, DeepAnomaly, is proposed to perform real-time anomaly detection of distant, heavily occluded, and unknown obstacles in agriculture. Compared to a state-of-the-art object detector, Faster R-CNN, DeepAnomaly is able, for an agricultural use case, to detect humans better and at longer ranges (45-90 m) using a smaller memory footprint and 7.3-times-faster processing. The low memory footprint and fast processing make DeepAnomaly suitable for real-time applications running on an embedded GPU.
FieldSAFE is a multi-modal dataset for detection of static and moving obstacles in agriculture. The dataset includes synchronized recordings from an RGB camera, stereo camera, thermal camera, 360-degree camera, lidar, and radar. Precise localization and pose are provided using IMU and GPS. Ground truth for static and moving obstacles (humans, mannequin dolls, barrels, buildings, vehicles, and vegetation) is available as an annotated orthophoto and as GPS coordinates for moving obstacles. Detection information from multiple detection algorithms and sensors is fused into a map using Inverse Sensor Models and occupancy grid maps.

This thesis presents several scientific contributions to the state of the art within perception for autonomous tractors, including a dataset, a sensor platform, detection algorithms, and procedures to perform multi-sensor fusion. Furthermore, important engineering contributions to autonomous farming vehicles are presented, such as easily applicable, open-source software packages and algorithms that have been demonstrated in an end-to-end real-time detection system. The contributions of this thesis have demonstrated, addressed, and solved critical issues in utilizing camera-based perception systems that are essential to make autonomous vehicles in agriculture a reality.
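Fusing detections from several sensors with Inverse Sensor Models and occupancy grid maps typically reduces to per-cell log-odds updates. A highly simplified sketch with a crude binary inverse sensor model; the update values are illustrative assumptions, not the thesis's calibrated models:

```python
import numpy as np

class OccupancyGrid:
    """Minimal log-odds occupancy grid fusing binary detections.

    Detected cells gain l_occ, observed-free cells gain l_free
    (assumed values); unobserved cells keep log-odds 0, i.e. p = 0.5.
    """

    def __init__(self, shape, l_occ=0.85, l_free=-0.4):
        self.log_odds = np.zeros(shape)
        self.l_occ, self.l_free = l_occ, l_free

    def update(self, detected, free):
        """detected, free: boolean masks over the grid for one sensor frame."""
        self.log_odds[detected] += self.l_occ
        self.log_odds[free] += self.l_free

    def probability(self):
        """Convert accumulated log-odds back to occupancy probability per cell."""
        return 1.0 / (1.0 + np.exp(-self.log_odds))
```

Because log-odds simply add, detections from any number of sensors or algorithms can be fused by calling `update` once per frame per sensor.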
Design and implementation of an SDR-based multi-frequency ground-based SAR system
Synthetic Aperture Radar (SAR) has proven a valuable tool in the monitoring of the Earth, at either global or local scales. SAR is a coherent radar system able to image extended areas with high resolution, and it finds applications in many areas such as forestry, agriculture, mining, structure inspection, and security operations.
Although space-borne SAR systems can image extended areas, their main limitation is their long revisit times, which are not suitable for applications where the target experiences rapid changes, on a scale of minutes to a few days. Ground-Based SAR (GBSAR) systems have proven useful in filling this revisit-time gap by imaging relatively small areas continuously, with extents usually smaller than a few square kilometers. GBSAR systems have been used extensively for the monitoring of slope instability and are a common tool in the mining sector.
The development of the GBSAR is relatively recent, and various developments have taken place since the 2000s, transitioning from the usage of Vector Network Analyzers (VNAs) to custom radar cores tailored for this application. This transition brings a reduction in cost, but at the same time a loss of operational flexibility. Specifically, most GBSAR sensors now operate at a single frequency, losing the value of the multi-band operation that VNAs provided.
This work is motivated by the idea that it is worthwhile to recover the value of multi-frequency GBSAR measurements while maintaining a limited system cost. To implement a GBSAR with these characteristics, Software Defined Radio (SDR) devices are identified as a good option for the fast and flexible implementation of broadband transceivers.
This thesis details the design and implementation process of an SDR-based Frequency Modulated Continuous Wave (FMCW) GBSAR system from the ground up, presenting the main issues related to the use of the most common SDR analog architecture, the Zero-IF transceiver. The main problem is determined to be the interaction of spurs caused by IQ imbalances of the analog transceiver with the FMCW demodulation process. Two effective techniques to overcome these issues, Super Spatial Variant Apodization (SSVA) and Short-Time Fourier Transform (STFT) signal reconstruction, are implemented and tested. The thesis also deals with the digital implementation of the signal generator and digital receiver, which are implemented on top of an RF Network-on-Chip (RFNoC) architecture in the SDR Field Programmable Gate Array (FPGA). Another important aspect of this work is the development of a radiofrequency front-end that extends the capabilities of the SDR, implementing filtering, amplification, leakage mitigation, and up-conversion to X-band. Finally, a set of test campaigns is described, in which the operation of the system is verified and the value of multi-frequency GBSAR observations is shown.
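In an FMCW radar such as this GBSAR, dechirping (mixing the received echo with the transmitted chirp) maps target range R to a beat frequency f_b = 2RB/(cT). A small simulation sketch of this relationship; the parameters are illustrative, not those of the thesis's X-band system:

```python
import numpy as np

# Illustrative FMCW parameters (assumed, not the thesis's system).
c = 3e8            # speed of light, m/s
B, T = 100e6, 1e-3 # sweep bandwidth (Hz) and sweep duration (s)
fs = 1e6           # sampling rate of the dechirped (beat) signal, Hz
R = 300.0          # target range, m

tau = 2 * R / c            # round-trip delay
k = B / T                  # chirp rate, Hz/s
n = int(T * fs)            # samples per sweep
t = np.arange(n) / fs

# After mixing TX with the delayed echo, the residual is a tone at k * tau.
beat = np.exp(2j * np.pi * k * tau * t)

# Recover the beat frequency with an FFT; expected f_b = 2*R*B/(c*T) = 200 kHz.
f = np.fft.fftfreq(n, 1 / fs)
f_est = abs(f[np.argmax(np.abs(np.fft.fft(beat)))])
```

The IQ-imbalance spurs discussed in the thesis appear as an image tone at -f_b in this complex beat signal, which is what the SSVA and STFT reconstruction techniques suppress.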
Building Footprint Extraction from LiDAR Data and Imagery Information
This study presents an automatic method for the regularisation of building outlines. Initially, building segments are extracted using a new fusion method. Data- and model-driven approaches are then combined to generate approximate building polygons. The core of the method is a novel data-driven algorithm based on a likelihood equation derived from the geometrical properties of a building. Finally, Gauss-Helmert and Gauss-Markov adjustment models are implemented and modified for the regularisation of building outlines under orthogonality constraints.
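A common primitive in outline regularisation is estimating a building's dominant edge orientation before enforcing orthogonality. The sketch below uses a length-weighted circular mean over edge angles folded modulo 90 degrees; this is a simplified stand-in, not the paper's Gauss-Helmert/Gauss-Markov adjustment:

```python
import numpy as np

def dominant_orientation(poly):
    """Estimate a building's dominant edge orientation, modulo 90 degrees.

    poly: (N, 2) polygon vertices (implicitly closed). Edge angles are
    folded into [0, 90 deg) so that perpendicular walls reinforce each
    other, then combined with a length-weighted circular mean.
    """
    edges = np.diff(np.vstack([poly, poly[:1]]), axis=0)
    lengths = np.linalg.norm(edges, axis=1)
    angles = np.arctan2(edges[:, 1], edges[:, 0])   # edge directions
    folded = np.mod(angles, np.pi / 2)              # fold to [0, pi/2)
    # Circular mean with period pi/2 (hence the factor 4 in the phase).
    mean = np.angle(np.sum(lengths * np.exp(4j * folded))) / 4.0
    return np.mod(mean, np.pi / 2)
```

Once this orientation is known, each approximate polygon edge can be snapped to it or to its perpendicular, which is the effect the orthogonality constraints achieve in the adjustment.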
Extracting Physical and Environmental Information of Irish Roads Using Airborne and Mobile Sensors
Airborne sensors including LiDAR and digital cameras are now used extensively for capturing topographical information, as these are often more economical and efficient than traditional photogrammetric and land surveying techniques. Data captured using airborne sensors can be used to extract 3D information important for, inter alia, city modelling, land use classification, and urban planning. According to the EU noise directive (2002/49/EC), the National Road Authority (NRA) in Ireland is responsible for generating noise models for all roads used by more than 8,000 vehicles per day. Accordingly, the NRA has to cover approximately 4,000 km of road, 500 m on each side, and these noise models have to be updated every 5 years. Important inputs to the noise model are the digital terrain model (DTM), 3D building data, road width, road centre line, ground surface type, and noise barriers. The objective of this research was to extract these objects and topographical information using nationally available datasets acquired from the Ordnance Survey of Ireland (OSI). The OSI uses ALS50-II LiDAR and ADS40 digital sensors for capturing ground information. Both sensors rely on direct georeferencing, minimizing the need for ground control points. Before exploiting the complementary nature of both datasets for information extraction, their planimetric and vertical accuracies were evaluated using independent ground control points. A new method was also developed for registration in case of any mismatch: DSMs from LiDAR and aerial images were used to find common points to determine the parameters of a 2D conformal transformation. The developed method was also evaluated by the EuroSDR in a project which involved a number of partners. These measures were taken to ensure that the inputs to the noise model were of acceptable accuracy, as recommended in the report (Assessment of Exposure to Noise, 2006) by the European Working Group.
A combination of image classification techniques was used to extract information by the fusion of LiDAR and aerial images. The developed method has two phases, viz. object classification and object reconstruction. Buildings and vegetation were classified based on the Normalized Difference Vegetation Index (NDVI) and a normalized digital surface model (nDSM). Holes in building segments were filled by object-oriented multiresolution segmentation. Vegetation that remained amongst buildings was classified using cues obtained from LiDAR; the shortcomings therein were overcome by developing an additional classification cue using multiple returns. The building extents were extracted and assigned a single height value generated from the LiDAR nDSM. The extracted height was verified against ground truth data acquired using terrestrial survey techniques. Vegetation was further classified into three categories, viz. trees, hedges, and tree clusters, based on a shape parameter (for hedges) and distance from neighbouring trees (for clusters). The ground was classified into three surface types, i.e. roads and parking areas, exposed surface, and grass. This was done using LiDAR intensity, NDVI, and the nDSM. Mobile Laser Scanning (MLS) data was used to extract walls and purpose-built noise barriers, since these objects were not extractable from the available airborne sensor data. Principal Component Analysis (PCA) was used to filter points belonging to such objects, and a line was then fitted to these points using robust least squares fitting. The developed object extraction method was tested objectively in two independent areas, namely Test Area-1 and Test Area-2. The results were thoroughly investigated by three different accuracy assessment methods using the OSI vector data. The acceptance of any developed method for commercial applications requires completeness and correctness values of 85% and 70%, respectively.
Accuracy measures obtained using the developed method of object extraction support its applicability for noise modelling.
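The surface-type classification from LiDAR intensity, NDVI, and nDSM described above can be sketched as simple per-pixel rules. All thresholds below are illustrative assumptions, not the study's calibrated values:

```python
import numpy as np

def classify_surface(ndvi, ndsm, intensity,
                     veg_thr=0.3, height_thr=2.5, road_int_thr=30.0):
    """Rule-based per-pixel labelling from NDVI, nDSM, and lidar intensity.

    Labels: 0 exposed surface, 1 grass, 2 road/parking,
            3 building, 4 tall vegetation (trees/hedges).
    All thresholds are assumed, illustrative values.
    """
    labels = np.zeros(ndvi.shape, dtype=np.uint8)
    high = ndsm > height_thr      # elevated above ground
    green = ndvi > veg_thr        # vegetated
    labels[green & ~high] = 1                                 # grass
    labels[~green & ~high & (intensity < road_int_thr)] = 2   # dark, flat: road
    labels[high & ~green] = 3                                 # tall, not green
    labels[high & green] = 4                                  # tall and green
    return labels
```

The study's actual pipeline adds object-oriented segmentation and multiple-return cues on top of such per-pixel evidence.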
Generalized Sparse Convolutional Neural Networks for Semantic Segmentation of Point Clouds Derived from Tri-Stereo Satellite Imagery
We studied the applicability of point clouds derived from tri-stereo satellite imagery to semantic segmentation with generalized sparse convolutional neural networks, using an Austrian study area as an example. We examined, in particular, whether the distorted geometric information, in addition to color, influences the performance of segmenting clutter, roads, buildings, trees, and vehicles. To this end, we trained a fully convolutional neural network that uses generalized sparse convolution once solely on 3D geometric information (i.e., the 3D point cloud derived by dense image matching), and twice on 3D geometric as well as color information; in the first of those experiments we did not use class weights, whereas in the second we did. We compared the results with a fully convolutional neural network trained on a 2D orthophoto, and with a decision tree trained once on hand-crafted 3D geometric features and once on hand-crafted 3D geometric as well as color features. The decision tree using hand-crafted features has been successfully applied to aerial laser scanning data in the literature. Hence, we compared our main interest of study, a representation learning technique, with another representation learning technique and with a non-representation-learning technique. Our study area is located in Waldviertel, a region in Lower Austria; the territory is hilly and covered mainly by forests, agriculture, and grasslands. Our classes of interest are heavily unbalanced; however, we did not use any data augmentation techniques to counter overfitting. For our study area, we report that adding color to the geometric information only improves the performance of the Generalized Sparse Convolutional Neural Network (GSCNN) on the dominant class, which leads to a higher overall performance in our case. We also found that training the network with median class weighting partially reverts the effects of adding color, and the network then also starts to learn the classes with lower occurrences. The fully convolutional neural network trained on the 2D orthophoto generally outperforms the other two, with a kappa score of over 90% and an average per-class accuracy of 61%. However, the decision tree trained on colors and hand-crafted geometric features has a 2% higher accuracy for roads.
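The median class weighting mentioned above is commonly computed as median-frequency balancing, w_c = median(f) / f_c; whether the paper uses exactly this recipe is an assumption. A minimal sketch:

```python
import numpy as np

def median_frequency_weights(labels, num_classes):
    """Median-frequency class weights: w_c = median(freq) / freq_c.

    Rare classes get weights > 1, dominant classes < 1, countering
    heavy class imbalance in the loss. Classes absent from the labels
    get NaN and should be masked out of the loss.
    """
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(float)
    freq = counts / counts.sum()
    freq[freq == 0] = np.nan          # ignore classes absent from the data
    return np.nanmedian(freq) / freq
```

These weights would then be passed to the per-class weighting argument of the segmentation loss.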