Evaluation of modelling approaches for predicting the spatial distribution of soil organic carbon stocks at the national scale
Soil organic carbon (SOC) plays a major role in the global carbon budget. It
can act as a source or a sink of atmospheric carbon, thereby possibly
influencing the course of climate change. Improving the tools that model the
spatial distributions of SOC stocks at national scales is a priority, both for
monitoring changes in SOC and as an input for global carbon cycles studies. In
this paper, we compare and evaluate two recent and promising modelling
approaches. First, we considered several increasingly complex boosted
regression trees (BRT), a convenient and efficient multiple regression model
from the statistical learning field. Further, we considered a robust
geostatistical approach coupled to the BRT models. Testing the different
approaches was performed on the dataset from the French Soil Monitoring
Network, with a consistent cross-validation procedure. We showed that when a
limited number of predictors were included in the BRT model, the standalone BRT
predictions were significantly improved by robust geostatistical modelling of
the residuals. However, when data for several SOC drivers were included, the
standalone BRT model predictions were not significantly improved by
geostatistical modelling. Therefore, in this latter situation, the BRT
predictions might be considered adequate without the need for geostatistical
modelling, provided that i) care is exercised in model fitting and validation,
and ii) the dataset does not allow for modelling of local spatial
autocorrelations, as is the case for many national systematic sampling schemes.
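The two-stage approach described above, a boosted regression tree fit followed by geostatistical modelling of its residuals, can be sketched on synthetic data as follows. The covariates, the exponential covariance, and all parameter values are illustrative assumptions, not the paper's actual implementation (which uses robust geostatistics on the French Soil Monitoring Network data):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic SOC-like data: coordinates, covariates, and a target with both
# a covariate-driven trend and a spatially correlated component.
n = 200
coords = rng.uniform(0, 100, size=(n, 2))
X = rng.normal(size=(n, 3))                      # e.g. climate, land use, clay
spatial = np.sin(coords[:, 0] / 15) + np.cos(coords[:, 1] / 20)
y = 2.0 * X[:, 0] - X[:, 1] + spatial + rng.normal(0, 0.1, n)

# Stage 1: boosted regression trees on the covariates alone.
brt = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05,
                                max_depth=3, random_state=0)
brt.fit(X, y)
resid = y - brt.predict(X)

# Stage 2: simple-kriging-style interpolation of the BRT residuals with an
# exponential covariance (sill and range values chosen for illustration).
def exp_cov(h, sill=1.0, corr_range=20.0):
    return sill * np.exp(-h / corr_range)

d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
K = exp_cov(d) + 1e-6 * np.eye(n)                # small nugget for stability
weights = np.linalg.solve(K, resid)

def predict(coords_new, X_new):
    h = np.linalg.norm(coords_new[:, None, :] - coords[None, :, :], axis=-1)
    return brt.predict(X_new) + exp_cov(h) @ weights

# In-sample check: kriging the residuals tightens the fit because the BRT
# cannot see the purely spatial component.
rmse_brt = np.sqrt(np.mean((y - brt.predict(X)) ** 2))
rmse_both = np.sqrt(np.mean((y - predict(coords, X)) ** 2))
print(rmse_brt, rmse_both)
```

The sketch mirrors the paper's finding in miniature: when the covariates omit a spatial driver, the residuals carry structure that geostatistical modelling can recover.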
Novel MLR-RF-Based Geospatial Techniques: A Comparison with OK
Geostatistical estimation methods rely on experimental variograms that are often erratic, leading to subjective model fitting, and they assume a normal distribution during conditional simulations. In contrast, Machine Learning Algorithms (MLA) are (1) free of such limitations and (2) able to incorporate information from multiple sources, and they are therefore attracting increasing interest for real-time resource estimation and automation. However, MLAs still need to be explored for robust learning of phenomena, better accuracy, and computational efficiency. This paper compares two MLAs, Multiple Linear Regression (MLR) and Random Forest (RF), with Ordinary Kriging (OK). The techniques were applied to the publicly available Walker Lake sample dataset and validated against the exhaustive Walker Lake dataset. The results of MLR were significant (p < 10 × 10−5), with a correlation coefficient of 0.81 (R-square = 0.65) compared to 0.79 (R-square = 0.62) from the RF and OK methods. Additionally, MLR was automated (free from the intermediary step of variogram modelling required in OK), produced unbiased estimates, identified key samples representing different zones, and had higher computational efficiency.
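The MLR-versus-RF comparison can be sketched on synthetic data. This is a stand-in, not the actual Walker Lake dataset, and the predictors, noise level, and in-sample correlation check are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Synthetic stand-in for a sample dataset: two predictors (e.g. a secondary
# variable and one coordinate) driving a grade-like target.
n = 300
X = rng.uniform(0, 1, size=(n, 2))
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(0, 0.3, n)

# Multiple Linear Regression via ordinary least squares.
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_mlr = A @ beta

# Random Forest on the same predictors; no variogram modelling step needed.
rf = RandomForestRegressor(n_estimators=200, random_state=1)
rf.fit(X, y)
y_rf = rf.predict(X)

# Compare via the correlation coefficient, as in the abstract.
r_mlr = np.corrcoef(y, y_mlr)[0, 1]
r_rf = np.corrcoef(y, y_rf)[0, 1]
print(round(r_mlr, 2), round(r_rf, 2))
```

On truly linear data MLR matches or beats RF, which is consistent with the abstract's result on Walker Lake; on strongly nonlinear data the ranking would typically reverse.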
Polynomial-Chaos-based Kriging
Computer simulation has become the standard tool in many engineering fields
for designing and optimizing systems, as well as for assessing their
reliability. To cope with demanding analysis such as optimization and
reliability, surrogate models (a.k.a meta-models) have been increasingly
investigated in the last decade. Polynomial Chaos Expansions (PCE) and Kriging
are two popular non-intrusive meta-modelling techniques. PCE surrogates the
computational model with a series of orthonormal polynomials in the input
variables where polynomials are chosen in coherency with the probability
distributions of those input variables. On the other hand, Kriging assumes that
the computer model behaves as a realization of a Gaussian random process whose
parameters are estimated from the available computer runs, i.e. input vectors
and response values. These two techniques have been developed more or less in
parallel so far with little interaction between the researchers in the two
fields. In this paper, PC-Kriging is derived as a new non-intrusive
meta-modeling approach combining PCE and Kriging. A sparse set of orthonormal
polynomials (PCE) approximates the global behavior of the computational model
whereas Kriging manages the local variability of the model output. An adaptive
algorithm similar to the least angle regression algorithm determines the
optimal sparse set of polynomials. PC-Kriging is validated on various benchmark
analytical functions which are easy to sample for reference results. From the
numerical investigations it is concluded that PC-Kriging performs better than,
or at least as well as, the two distinct meta-modelling techniques. A larger
gain in accuracy is obtained when the experimental design has a limited size,
which is an asset when dealing with demanding computational models.
Self-Calibration Methods for Uncontrolled Environments in Sensor Networks: A Reference Survey
Growing progress in sensor technology has constantly expanded the number and
range of low-cost, small, and portable sensors on the market, increasing the
number and type of physical phenomena that can be measured with wirelessly
connected sensors. Large-scale deployments of wireless sensor networks (WSN)
involving hundreds or thousands of devices and limited budgets often constrain
the choice of sensing hardware, which generally has reduced accuracy,
precision, and reliability. Therefore, it is challenging to achieve good data
quality and maintain error-free measurements during the whole system lifetime.
Self-calibration or recalibration in ad hoc sensor networks to preserve data
quality is essential, yet challenging, for several reasons, such as the
existence of random noise and the absence of suitable general models.
Calibration performed in the field, without accurate and controlled
instrumentation, is said to take place in an uncontrolled environment. This
paper surveys current and fundamental self-calibration approaches and models
for wireless sensor networks in uncontrolled environments.
Towards an operational model for estimating day and night instantaneous near-surface air temperature for urban heat island studies: outline and assessment
Near-surface air temperature (NSAT) is key for assessing urban heat islands, human health, and well-being. However, a widely recognized, cost- and time-effective, replicable approach for estimating hourly NSAT is still urgently needed. In this study, we outline and validate an easy-to-replicate yet effective operational model for automating the estimation of high-resolution day and night instantaneous NSAT. The model is tested on a heat wave event and over a large geographical area. It combines remotely sensed land surface temperature and a digital elevation model with air temperature from local fixed weather station networks. The resulting NSAT has a daily and hourly frequency consistent with the MODIS revisit time. A geographically weighted regression method is employed, with exponential weighting found to be highly accurate for our purpose. A robust assessment of different methods, at different time slots, both day- and night-time, and during a heatwave event, is provided based on a cross-validation protocol. Four time periods are modelled and tested over two consecutive days, i.e. 31 July 2020 at 10:40 and 21:50, and 1 August 2020 at 02:00 and 13:10 local time. High R2 values were found for all time slots, ranging from 0.82 to 0.88, with a bias close to 0, RMSE ranging from 1.45 °C to 1.77 °C, and MAE from 1.15 °C to 1.36 °C. Normalized RMSE and MAE are roughly 0.05 to 0.08. Overall, compared to other recognized regression models, the proposed model is also more effective in terms of spatial autocorrelation of residuals and model sensitivity.
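Geographically weighted regression with exponential distance weighting, the core of the model above, can be sketched on synthetic station data. The station layout, predictors, bandwidth, and data-generating model are illustrative assumptions, not the study's configuration:

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic stations: coordinates (km), land surface temperature (LST, in
# degrees C), elevation (m), and observed NSAT. All values are illustrative.
n = 150
coords = rng.uniform(0, 50, size=(n, 2))
lst = rng.uniform(25, 45, n)
elev = rng.uniform(0, 500, n)
slope = 0.6 + 0.2 * np.sin(coords[:, 0] / 10)    # spatially varying LST effect
nsat = slope * lst - 0.006 * elev + rng.normal(0, 0.5, n)

X = np.column_stack([np.ones(n), lst, elev])

def gwr_predict(pt, x_row, bandwidth=5.0):
    """Weighted least squares fitted at one location, with exponential
    distance decay, then a local NSAT prediction at that location."""
    d = np.linalg.norm(coords - pt, axis=1)
    w = np.exp(-d / bandwidth)                   # exponential weighting kernel
    Xw = X * w[:, None]
    beta = np.linalg.solve(X.T @ Xw, Xw.T @ nsat)
    return x_row @ beta

# In-sample comparison against a single global OLS fit.
pred_gwr = np.array([gwr_predict(coords[i], X[i]) for i in range(n)])
beta_ols, *_ = np.linalg.lstsq(X, nsat, rcond=None)
pred_ols = X @ beta_ols

rmse_gwr = np.sqrt(np.mean((pred_gwr - nsat) ** 2))
rmse_ols = np.sqrt(np.mean((pred_ols - nsat) ** 2))
print(round(rmse_gwr, 2), round(rmse_ols, 2))
```

Because the LST-to-NSAT relationship varies in space, the locally fitted coefficients track it where a single global regression cannot, which is the rationale for GWR in this kind of urban heat island mapping.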