Self-supervised monocular depth estimation from oblique UAV videos
UAVs have become an essential photogrammetric measurement tool as they are
affordable, easily accessible and versatile. Aerial images captured from UAVs
have applications in small and large scale texture mapping, 3D modelling,
object detection tasks, DTM and DSM generation etc. Photogrammetric techniques
are routinely used for 3D reconstruction from UAV images where multiple images
of the same scene are acquired. Developments in computer vision and deep
learning techniques have made Single Image Depth Estimation (SIDE) a field of
intense research. Using SIDE techniques on UAV images can overcome the need for
multiple images for 3D reconstruction. This paper aims to estimate depth from a
single UAV aerial image using deep learning. We follow a self-supervised
learning approach, Self-Supervised Monocular Depth Estimation (SMDE), which
does not need ground truth depth or any extra information other than images for
learning to estimate depth. Monocular video frames are used for training the
deep learning model which learns depth and pose information jointly through two
different networks, one each for depth and pose. The predicted depth and pose
are used to reconstruct one image from the viewpoint of another image utilising
the temporal information from videos. We propose a novel architecture with two
2D CNN encoders and a 3D CNN decoder for extracting information from
consecutive temporal frames. A contrastive loss term is introduced for
improving the quality of image generation. Our experiments are carried out on
the public UAVid video dataset. The experimental results demonstrate that our
model outperforms the state-of-the-art methods in estimating the depths.
Comment: Submitted to ISPRS Journal of Photogrammetry and Remote Sensing
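The view-synthesis step described in this abstract (using predicted depth and pose to reconstruct one frame from the viewpoint of another) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `reproject_pixels`, the pinhole intrinsics matrix `K`, and the target-to-source pose convention `(R, t)` are assumptions made for illustration, and the depth and pose networks and the contrastive loss are not shown.

```python
import numpy as np

def reproject_pixels(depth, K, R, t):
    """Map every pixel of a target frame into a source frame.

    depth : (h, w) per-pixel depth of the target frame
    K     : (3, 3) pinhole camera intrinsics (assumed shared by both frames)
    R, t  : rotation (3, 3) and translation (3,) taking target-camera
            coordinates to source-camera coordinates (assumed convention)
    Returns an array of shape (2, h, w) with the (u, v) source coordinates.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]                       # row and column indices
    pix = np.stack([u, v, np.ones_like(u)], axis=0)  # homogeneous pixels
    pix = pix.reshape(3, -1).astype(float)

    rays = np.linalg.inv(K) @ pix                   # back-project to unit-depth rays
    pts_target = rays * depth.reshape(1, -1)        # 3D points in the target camera
    pts_source = R @ pts_target + t.reshape(3, 1)   # move into the source camera

    proj = K @ pts_source                           # project with the same intrinsics
    uv = proj[:2] / proj[2:3]                       # perspective divide
    return uv.reshape(2, h, w)
```

Sampling the source frame at the returned coordinates yields a reconstruction of the target frame; self-supervised methods of this kind typically penalise the photometric difference between that reconstruction and the real target frame.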
UAV Oblique Imagery with an Adaptive Micro-Terrain Model for Estimation of Leaf Area Index and Height of Maize Canopy from 3D Point Clouds
Leaf area index (LAI) and height are two critical measures of maize crops that are used in ecophysiological and morphological studies for growth evaluation, health assessment, and yield prediction. However, mapping spatial and temporal variability of LAI in fields using handheld tools and traditional techniques is a tedious and costly pointwise operation that provides information only within limited areas. The objective of this study was to evaluate the reliability of mapping LAI and height of maize canopy from 3D point clouds generated from UAV oblique imagery with the adaptive micro-terrain model. The experiment was carried out in a field planted with three cultivars having different canopy shapes and four replicates covering a total area of 48 × 36 m. RGB images in nadir and oblique view were acquired from the maize field at six different time slots during the growing season. Images were processed by Agisoft Metashape to generate 3D point clouds using the structure from motion method and were later processed by MATLAB to obtain clean canopy structure, including height and density. The LAI was estimated by a multivariate linear regression model using crop canopy descriptors derived from the 3D point cloud, which account for height and leaf density distribution along the canopy height. A simulation analysis based on the sine function effectively demonstrated the micro-terrain model from point clouds. For the ground truth data, a randomized block design with 24 sample areas was used to manually measure LAI, height, N-pen data, and yield during the growing season. It was found that canopy height data from the 3D point clouds has a relatively strong correlation (R² = 0.89, 0.86, 0.78) with the manual measurement for three cultivars with CH90. The proposed methodology allows cost-effective, high-resolution in-field LAI mapping from UAV 3D data that can be used as an alternative to conventional LAI assessments, even in inaccessible regions.
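The multivariate linear regression mentioned in this abstract can be sketched as an ordinary least-squares fit. The abstract does not list the actual canopy descriptors or coefficients, so the helper names `fit_linear_model` and `predict` and the layout of the descriptor matrix are hypothetical, and any inputs used to try it out would have to be supplied by the reader.

```python
import numpy as np

def fit_linear_model(X, y):
    """Ordinary least-squares fit of y = b0 + b1*x1 + ... + bk*xk.

    X : (n_samples, n_descriptors) canopy descriptors (hypothetical layout)
    y : (n_samples,) manually measured LAI values
    Returns the coefficient vector [b0, b1, ..., bk].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])   # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Evaluate the fitted model on new descriptor rows."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]
```

With one descriptor row per sample area, `fit_linear_model` returns the intercept followed by one weight per descriptor, and `predict` applies those weights to descriptors extracted elsewhere in the field.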
Real-Time Dense 3D Reconstruction from Monocular Video Data Captured by Low-Cost UAVs
Real-time 3D reconstruction enables fast dense mapping of the environment which benefits numerous applications, such as navigation or live evaluation of an emergency. In contrast to most real-time capable approaches, our method does not need an explicit depth sensor. Instead, we only rely on a video stream from a camera and its intrinsic calibration. By exploiting the self-motion of the unmanned aerial vehicle (UAV) flying with oblique view around buildings, we estimate both camera trajectory and depth for selected images with enough novel content. To create a 3D model of the scene, we rely on a three-stage processing chain. First, we estimate the rough camera trajectory using a simultaneous localization and mapping (SLAM) algorithm. Once a suitable constellation is found, we estimate depth for local bundles of images using a Multi-View Stereo (MVS) approach and then fuse this depth into a global surfel-based model. For our evaluation, we use 55 video sequences with diverse settings, consisting of both synthetic and real scenes. We evaluate not only the generated reconstruction but also the intermediate products and achieve competitive results both qualitatively and quantitatively. At the same time, our method can keep up with a 30 fps video for a resolution of 768 × 448 pixels
Automatic Registration of Optical Aerial Imagery to a LiDAR Point Cloud for Generation of City Models
This paper presents a framework for automatic registration of both the optical and 3D structural information extracted from oblique aerial imagery to a Light Detection and Ranging (LiDAR) point cloud without prior knowledge of an initial alignment. The framework employs a coarse to fine strategy in the estimation of the registration parameters. First, a dense 3D point cloud and the associated relative camera parameters are extracted from the optical aerial imagery using a state-of-the-art 3D reconstruction algorithm. Next, a digital surface model (DSM) is generated from both the LiDAR and the optical imagery-derived point clouds. Coarse registration parameters are then computed from salient features extracted from the LiDAR and optical imagery-derived DSMs. The registration parameters are further refined using the iterative closest point (ICP) algorithm to minimize global error between the registered point clouds.
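The fine-registration stage named above, the iterative closest point (ICP) algorithm, can be illustrated with a minimal point-to-point sketch. It assumes the coarse stage has already brought the two clouds close together, uses brute-force nearest-neighbour search that is only practical for small clouds, and is not the paper's implementation; the function names `best_rigid_transform` and `icp` are illustrative.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src_i + t ~ dst_i
    for paired points (Kabsch algorithm via SVD)."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(src, dst, n_iter=20, tol=1e-12):
    """Point-to-point ICP aligning src (n, 3) to dst (m, 3).

    Returns the accumulated rotation, translation and the final RMS
    distance between matched points.
    """
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = np.asarray(src, dtype=float).copy()
    dst = np.asarray(dst, dtype=float)
    prev_err = np.inf
    for _ in range(n_iter):
        # Brute-force nearest neighbour in dst for every current point.
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        idx = d2.argmin(axis=1)
        R, t = best_rigid_transform(cur, dst[idx])
        cur = cur @ R.T + t
        # Compose the new step with the transform found so far.
        R_total = R @ R_total
        t_total = R @ t_total + t
        err = np.sqrt(((cur - dst[idx]) ** 2).sum(axis=1).mean())
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_total, t_total, err
```

Because each iteration trusts the current nearest-neighbour matches, ICP only converges to the right answer from a reasonable starting pose, which is why a coarse registration step precedes it in a coarse-to-fine pipeline.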
The novelty of the proposed approach is in the computation of salient features from the DSMs, and the selection of matching salient features using geometric invariants coupled with Normalized Cross Correlation (NCC) match validation. The feature extraction and matching process enables the automatic estimation of the coarse registration parameters required for initializing the fine registration process. The registration framework is tested on a simulated scene and aerial datasets acquired in real urban environments. Results demonstrate the robustness of the framework for registering optical and 3D structural information extracted from aerial imagery to a LiDAR point cloud, when co-existing initial registration parameters are unavailable.
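Normalized Cross Correlation, used above to validate candidate matches, scores two equally sized patches between -1 and 1 and is unaffected by a positive gain or an offset applied to one of the patches. A minimal sketch follows; the function name `ncc` and the choice to return 0.0 for a constant patch are assumptions, and the abstract does not state the patch size or acceptance threshold used.

```python
import numpy as np

def ncc(patch_a, patch_b):
    """Zero-mean normalized cross correlation of two equally shaped patches."""
    a = np.asarray(patch_a, dtype=float)
    b = np.asarray(patch_b, dtype=float)
    a0 = a - a.mean()
    b0 = b - b.mean()
    denom = np.sqrt((a0 ** 2).sum() * (b0 ** 2).sum())
    if denom == 0.0:
        return 0.0        # a constant patch carries no structure to correlate
    return float((a0 * b0).sum() / denom)
```

A candidate pair of salient features could then be kept only when the score of the DSM patches around them exceeds a chosen threshold.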
An Approach Of Automatic Reconstruction Of Building Models For Virtual Cities From Open Resources
Along with the ever-increasing popularity of virtual reality technology in recent years, 3D city models have been used in different applications, such as urban planning, disaster management, tourism, entertainment, and video games. Currently, those models are mainly reconstructed from access-restricted data sources such as LiDAR point clouds, airborne images, satellite images, and UAV (uncrewed aerial vehicle) images with a focus on structural illustration of buildings’ contours and layouts. To help make 3D models closer to their real-life counterparts, this thesis research proposes a new approach for the automatic reconstruction of building models from open resources. In this approach, first, building shapes are reconstructed by using the structural and geographic information retrievable from the open repository of OpenStreetMap (OSM). Later, images available from the street view of Google Maps are used to extract information about the exterior appearance of buildings for texture mapping onto their boundaries. The constructed 3D environment is used as prior knowledge for navigation purposes in a self-driving car. The static objects from the 3D model are compared with the real-time images of static objects to reduce the computation time by eliminating them from the detection process.
UAVD4L: A Large-Scale Dataset for UAV 6-DoF Localization
Despite significant progress in global localization of Unmanned Aerial
Vehicles (UAVs) in GPS-denied environments, existing methods remain constrained
by the availability of datasets. Current datasets often focus on small-scale
scenes and lack viewpoint variability, accurate ground truth (GT) pose, and UAV
built-in sensor data. To address these limitations, we introduce a large-scale
6-DoF UAV dataset for localization (UAVD4L) and develop a two-stage 6-DoF
localization pipeline (UAVLoc), which consists of offline synthetic data
generation and online visual localization. Additionally, based on the 6-DoF
estimator, we design a hierarchical system for tracking ground targets in 3D
space. Experimental results on the new dataset demonstrate the effectiveness of
the proposed approach. Code and dataset are available at
https://github.com/RingoWRW/UAVD4
Historical aerial photographs for landslide assessment: two case histories
This paper demonstrates the value of historical aerial photographs for assessing long-term landslide evolution. The study focussed on two case histories, the Mam Tor and East Pentwyn landslides. For both, the study explored the variety of data that could be derived relatively easily using an ordinary desktop PC, commercially available software and commonly available photographic material. The techniques to unlock qualitative and quantitative data captured in the photographic archive were based on the principles of aerial photo-interpretation and photogrammetry. The created products comprised geomorphological maps, automatically derived digital elevation models (DEMs), displacement vectors and animations.
The measured horizontal displacements of the Mam Tor landslide ranged from 0.09 to 0.74 m/yr between 1953 and 1999, which was verified by independent survey data. Moreover, the observed displacement patterns were consistent with photo-interpreted geomorphological information. The photogrammetric measurements from the East Pentwyn landslide (horizontal displacements up to 6 m/yr between 1971 and 1973) also showed a striking resemblance to independent data. In both case histories, the vertical accuracy was insufficient for detecting significant elevation changes. Nevertheless, DEMs proved to be a powerful tool for visualisation. Overall, the results in this study validated the techniques used and strongly encourage the use of historical photographic material in landslide studies.