
    A comparison of machine learning regression techniques for LiDAR-derived estimation of forest variables

    Light Detection and Ranging (LiDAR) is a remote sensing technique able to extract three-dimensional information. Environmental models of forest areas have benefited from LiDAR-derived information in recent years. Multiple linear regression (MLR) preceded by stepwise feature selection is the most common method in the literature for developing those models. MLR defines the relation between the set of field measurements and the statistics extracted from a LiDAR flight. Machine learning has emerged as a suitable tool to improve on classic stepwise MLR results with LiDAR data. Unfortunately, few studies have compared the quality of the various machine learning approaches. This paper presents a comparison between the classic MLR-based methodology and machine learning regression techniques (neural networks, support vector machines, nearest neighbour, and ensembles such as random forests), with special emphasis on regression trees. The selected techniques are applied to real LiDAR data from two areas in the province of Lugo (Galizia, Spain). The results confirm that classic MLR is outperformed by machine learning techniques; in particular, our experiments suggest that support vector regression with Gaussian kernels statistically outperforms the other techniques.

    Funding: Ministerio de Ciencia y Tecnología TIN2011-28956-C02; Junta de Andalucía P12-TIC-1728; Universidad Pablo de Olavide APPB81309

    A Comparative Study of Machine Learning Regression Methods on LiDAR Data: A Case Study

    Light Detection and Ranging (LiDAR) is a remote sensing technique able to extract vertical information from sensed objects. LiDAR-derived information is now used to develop environmental models for describing fire behaviour or quantifying biomass stocks in forest areas. Multiple linear regression (MLR) preceded by stepwise feature selection is the most common method in the literature for developing LiDAR-derived models. MLR defines the relation between the set of field measurements and the statistics extracted from a LiDAR flight. Machine learning has recently received increasing attention as a way to improve on classic MLR results. Unfortunately, few studies have compared the quality of the various machine learning approaches. This paper presents a comparison between the classic MLR-based methodology and common machine learning regression techniques (neural networks, regression trees, support vector machines, nearest neighbour, and ensembles such as random forests). The selected techniques are applied to real LiDAR data from two areas in the province of Lugo (Galizia, Spain). The results show that support vector regression statistically outperforms the other techniques when feature selection is applied. However, its performance is not statistically different from that of random forests when the feature selection step is skipped.

    Forest Aboveground Biomass Estimation Using Multi-Source Remote Sensing Data in Temperate Forests

    Forests are a crucial part of global ecosystems. Accurately estimating aboveground biomass (AGB) is important in many applications, including monitoring carbon stocks, investigating forest degradation, and designing sustainable forest management strategies. Remote sensing techniques have proved to be a cost-effective way to estimate forest AGB with timely and repeated observations. This dissertation investigated the use of multiple remotely sensed datasets for forest AGB estimation in temperate forests. We compared the performance of Landsat and lidar data, individually and fused, for estimating AGB using multiple linear regression (MLR), Random Forest (RF) and Geographically Weighted Regression (GWR). MLR performed similarly to GWR, and both were better than RF. Integration of lidar and Landsat inputs outperformed either data source alone. However, although lidar provides valuable three-dimensional forest structure information, acquiring comprehensive lidar coverage is often cost-prohibitive. We therefore developed a lidar sampling framework to support AGB estimation from Landsat images. We compared two sampling strategies (systematic and classification-based) and found that the systematic sampling selection method was highly dependent on site conditions and had higher model variability. The classification-based lidar sampling strategy was easy to apply and provides a framework that is readily transferable to new study sites. The performance of Sentinel-2 and Landsat 8 data for quantifying AGB in a temperate forest using RF regression was also tested. We modeled AGB using three datasets: Sentinel-2, Landsat 8, and a pseudo dataset that retained the spatial resolution of Sentinel-2 but only the spectral bands that matched those on Landsat 8. We found that while RF model parameters affect model outcomes, variable selection matters more.

    Our results showed that the incorporation of red-edge information increased AGB estimation accuracy by approximately 6%. The additional spatial resolution improved accuracy by approximately 3%. The variable importance ranks in the RF regression model showed that, in addition to the red-edge bands, the shortwave infrared bands were important either individually (in the Sentinel-2 model) or in band indices. With the growing availability of remote sensing datasets, developing tools to apply remote sensing data appropriately and efficiently is increasingly important.

    Combined Impact of Sample Size and Modeling Approaches for Predicting Stem Volume in Eucalyptus spp. Forest Plantations Using Field and LiDAR Data

    Light Detection and Ranging (LiDAR) remote sensing has been established as one of the most promising tools for large-scale forest monitoring and mapping. Continuous advances in computational techniques, such as machine learning algorithms, have steadily improved our capability to model forest attributes accurately and at high spatial and temporal resolution. While previous studies have explored the use of LiDAR and machine learning algorithms for forest inventory modeling, no studies have yet demonstrated the combined impact of sample size and different modeling techniques for predicting and mapping total stem volume in industrial Eucalyptus spp. tree plantations. This study aimed to compare the combined effects of parametric and nonparametric modeling methods for estimating volume in Eucalyptus spp. tree plantations using airborne LiDAR data while varying the amount of reference data (sample size). The modeling techniques were compared in terms of root mean square error (RMSE), bias, and R² over 500 simulations. The best performance was achieved by the ordinary least-squares (OLS) method, which provided results comparable to traditional forest inventory approaches using only 40% (n = 63; ~0.04 plots/ha) of the total field plots, followed by the random forest (RF) algorithm at the same sample size. This study provides solutions for increasing the industry's efficiency in monitoring and managing forest plantation stem volume for the paper and pulp supply chain.
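
    The three comparison metrics named in the abstract above (RMSE, bias, and R²) can be illustrated with a minimal sketch. The definitions used here are assumptions, since the abstract does not state them: bias is taken as the mean of predicted minus observed, and R² as one minus the ratio of residual to total sum of squares. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math

def regression_metrics(observed, predicted):
    """Return (rmse, bias, r2) for paired observed and predicted values.

    Assumed conventions (not stated in the abstract):
      bias = mean(predicted - observed)
      r2   = 1 - SS_res / SS_tot
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    bias = sum(residuals) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # Raises ZeroDivisionError if every observed value is identical.
    r2 = 1.0 - ss_res / ss_tot
    return rmse, bias, r2

# Hypothetical plot-level stem volumes, for illustration only.
observed = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]
rmse, bias, r2 = regression_metrics(observed, predicted)
print(f"RMSE={rmse:.2f}  bias={bias:.2f}  R2={r2:.3f}")
```

    In a simulation setup like the one described, a function of this kind would be called once per simulated sample and the three values summarized across runs; how the study itself aggregated them is not stated in the abstract.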

    A Preliminary Study of the Suitability of Deep Learning to Improve LiDAR-Derived Biomass Estimation

    Light Detection and Ranging (LiDAR) is a remote sensing technique able to extract three-dimensional information about forest structure. Biophysical models have taken advantage of LiDAR-derived information to improve their accuracy. Multiple Linear Regression (MLR) is the most common method in the biomass estimation literature for defining the relation between the set of field measurements and the statistics extracted from a LiDAR flight. Unfortunately, open issues remain regarding the generalization of models from one area to another, owing to the lack of knowledge about the noise distribution, the relationships between statistical features, and the risk of overfitting. Autoencoders (a type of deep neural network) have recently been applied to improve the results of machine learning techniques by undoing possible data corruption processes and improving feature selection. This paper presents a preliminary comparison between the use of MLR with and without preprocessing by autoencoders on real LiDAR data from two areas in the province of Lugo (Galizia, Spain). The results show that autoencoders statistically increased the quality of MLR estimations by around 15–30%.

    Estimation of total biomass in Aleppo pine forest stands applying parametric and nonparametric methods to low-density airborne laser scanning data

    Accounting for total biomass can assist with the evaluation of climate regulation policies from local to global scales. This study estimates total biomass (TB), including tree and shrub biomass fractions, in Pinus halepensis Miller forest stands located in the Aragon Region (Spain) using Airborne Laser Scanning (ALS) data and fieldwork. Five selection methods and five regression models were compared to relate the TB, estimated in 83 field plots through allometric equations, to several independent variables extracted from the ALS point cloud. A height threshold was used to include only returns above 0.2 m when calculating ALS variables. The sample was divided into training and test sets of 62 and 21 plots, respectively. The model with the lowest root mean square error after validation (15.14 tons/ha) was the multiple linear regression model including three ALS variables: the 25th percentile of the return heights, the variance, and the percentage of first returns above the mean. This study confirms the usefulness of low-density ALS data for accurately estimating total biomass, and thus for better assessing the availability of biomass and carbon content in a Mediterranean Aleppo pine forest.

    Delineation of high resolution climate regions over the Korean Peninsula using machine learning approaches

    In this research, climate classification maps over the Korean Peninsula at 1 km resolution were generated from satellite-based monthly temperature and precipitation variables using machine learning approaches. Random forest (RF), artificial neural networks (ANN), k-nearest neighbor (KNN), logistic regression (LR), and support vector machines (SVM) were used to develop models. Training and validation of these models were conducted using in-situ observations from the Korea Meteorological Administration (KMA) from 2001 to 2016. The rules of the traditional Köppen-Geiger (K-G) climate classification were used to classify climate regions. The input variables were land surface temperature (LST) from the Moderate Resolution Imaging Spectroradiometer (MODIS), monthly precipitation data from the Tropical Rainfall Measuring Mission (TRMM) 3B43 product, and the Digital Elevation Model (DEM) from the Shuttle Radar Topography Mission (SRTM). The overall accuracy (OA) based on validation data from 2001 to 2016 exceeded 95% for all models. DEM and minimum winter temperature were the two variables with particularly high relative importance over the study area. ANN produced a more realistic spatial distribution of the classified climates despite having a slightly lower OA than the other models. The accuracy of the models was also assessed using high-altitude in-situ data from the Mountain Meteorology Observation System (MMOS). Although the MMOS record was relatively short (2013 to 2017), it showed that the snowy, dry and cold winter and cool summer class (Dwc) is widely located in the eastern coastal region of South Korea. Temporal shifts in climate were examined by comparing climate maps produced for three periods: 1950 to 2000, 1983 to 2000, and 2001 to 2013. A shrinking trend of snow classes (D) over the Korean Peninsula was clearly observed in the ANN-based climate classification results.

    Shifting climate trends, with a decrease in snow (D) classes and an increase in temperate (C) classes, were clearly shown in the maps produced using the proposed approaches, consistent with results from the reanalysis data of the Climatic Research Unit (CRU) and the Global Precipitation Climatology Centre (GPCC).