124 research outputs found

    An Emulator Toolbox to Approximate Radiative Transfer Models with Statistical Learning

    Physically-based radiative transfer models (RTMs) help in understanding the processes occurring on the Earth’s surface and their interactions with vegetation and atmosphere. When it comes to studying vegetation properties, RTMs allow us to study light interception by plant canopies and are used in the retrieval of biophysical variables through model inversion. However, advanced RTMs can be computationally expensive, which makes them infeasible in many real applications. To overcome this problem, it has been proposed to replace RTMs with so-called emulators. Emulators are statistical models that approximate the functioning of RTMs. They are advantageous in practice because of their computational efficiency, excellent accuracy and flexibility for extrapolation. We hereby present an “Emulator toolbox” that enables analysing multi-output machine learning regression algorithms (MO-MLRAs) on their ability to approximate an RTM. The toolbox is included in the free-access ARTMO MATLAB suite for parameter retrieval and model inversion and currently contains both linear and non-linear MO-MLRAs, namely partial least squares regression (PLSR), kernel ridge regression (KRR) and neural networks (NN). These MO-MLRAs have been evaluated on their precision and speed in approximating the soil vegetation atmosphere transfer model SCOPE (Soil Canopy Observation, Photochemistry and Energy balance). SCOPE generates, amongst others, sun-induced chlorophyll fluorescence as the output signal. KRR and NN proved capable of reconstructing fluorescence spectra with great precision. Relative errors fell below 0.5% when trained with 500 or more samples using cross-validation and principal component analysis to alleviate the underdetermination problem. Moreover, NN reconstructed fluorescence spectra about 50 times faster and KRR about 800 times faster than SCOPE.
The Emulator toolbox is foreseen to open new opportunities in the use of advanced RTMs, in which both consistent physical assumptions and data-driven machine learning algorithms coexist.

    Global sensitivity analysis of leaf-canopy-atmosphere RTMs: Implications for biophysical variables retrieval from top-of-atmosphere radiance data

    Knowledge of key variables driving the top of the atmosphere (TOA) radiance over a vegetated surface is an important step to derive biophysical variables from TOA radiance data, e.g., as observed by an optical satellite. Coupled leaf-canopy-atmosphere Radiative Transfer Models (RTMs) allow linking vegetation variables directly to the at-sensor TOA radiance measured. Global Sensitivity Analysis (GSA) of RTMs enables the computation of the total contribution of each input variable to the output variance. We determined the impacts of the leaf-canopy-atmosphere variables on TOA radiance using GSA to gain insights into retrievable variables. The leaf and canopy RTM PROSAIL was coupled with the atmospheric RTM MODTRAN5. Because of MODTRAN's computational burden and GSA's demand for many simulations, we first developed a surrogate statistical learning model, i.e., an emulator, that approximates RTM outputs through a machine learning algorithm with low computation time. A Gaussian process regression (GPR) emulator was used to reproduce lookup tables of TOA radiance as a function of 12 input variables with relative errors of 2.4%. GSA total sensitivity results quantified the driving variables of emulated TOA radiance along the 400-2500 nm spectral range at 15 cm⁻¹ (between 0.3 and 9 nm); overall, the vegetation variables play a more dominant role than atmospheric variables. This suggests the possibility of retrieving biophysical variables directly from at-sensor TOA radiance data. Particularly promising are leaf chlorophyll content, leaf water thickness and leaf area index, as these variables are the most important drivers of TOA radiance outside the water absorption regions. A software framework was developed to facilitate the development of retrieval models from at-sensor TOA radiance data.
As a proof of concept, maps of these biophysical variables have been generated for both TOA (L1C) and bottom-of-atmosphere (L2A) Sentinel-2 data by means of a hybrid retrieval scheme, i.e., training GPR retrieval algorithms using the RTM simulations. The maps obtained from L1C and L2A data are consistent, suggesting that vegetation properties can be retrieved directly from TOA radiance data given a cloud-free sky, thus without the need for an atmospheric correction.
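
The "total contribution of each input variable to the output variance" mentioned above is commonly quantified with total-order Sobol indices. The sketch below estimates them with the Jansen Monte Carlo estimator, which is an assumed choice here and not necessarily the estimator used in the paper; `toy_model` is a hypothetical cheap function standing in for the emulated RTM.

```python
import numpy as np

def total_sensitivity(model, n_inputs, n_samples=4096, seed=0):
    """Total-order Sobol indices via the Jansen estimator on uniform [0, 1] inputs."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(size=(n_samples, n_inputs))
    B = rng.uniform(size=(n_samples, n_inputs))
    y_a, y_b = model(A), model(B)
    var_y = np.var(np.concatenate([y_a, y_b]))
    indices = np.empty(n_inputs)
    for i in range(n_inputs):
        AB = A.copy()
        AB[:, i] = B[:, i]                    # resample only input i
        indices[i] = np.mean((y_a - model(AB)) ** 2) / (2.0 * var_y)
    return indices

def toy_model(X):
    """Hypothetical model: dominated by input 0, weakly driven by input 1, input 2 inert."""
    return X[:, 0] + 0.1 * X[:, 1] + 0.0 * X[:, 2]

ST = total_sensitivity(toy_model, n_inputs=3)
```

Each index needs one extra batch of model runs per input, which is why a fast emulator is substituted for the coupled RTM before running the analysis.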

    Integrating Physics Modelling with Machine Learning for Remote Sensing

    Earth observation based on data provided by sensors on board satellites, as well as data from radiative transfer or climate models, together with in situ measurements, provides an unprecedented way to monitor our planet at better spatial and temporal resolutions. The richness, quantity and diversity of the data acquired and made available is also growing very rapidly. These data allow us to predict crop yields, track land-use change such as deforestation, monitor and respond to natural disasters, and predict and mitigate climate change. To meet all these challenges, the last two decades have seen a large increase in the application of machine learning algorithms in Earth observation. Machine learning makes it possible to use efficiently a data stream that is growing in quantity and diversity. Machine learning algorithms, however, tend to be model-agnostic and too flexible, and therefore end up not respecting the fundamental laws of physics. On the other hand, recent years have seen an increase in research that attempts to integrate physics knowledge into learning algorithms in order to obtain interpretable and physically meaningful solutions. The main objective of this thesis is to design different ways of encoding physical knowledge to provide machine learning methods tailored to specific problems in remote sensing.
We introduce new methods that can optimally fuse heterogeneous data sources, exploit data regularities, incorporate differential equations, obtain accurate models that emulate, and are therefore consistent with, physical models, and obtain models that learn parametrizations of the system by combining models and simulations.
Earth observation through satellite sensors, models and in situ measurements provides a way to monitor our planet with unprecedented spatial and temporal resolution. The amount and diversity of the data which is recorded and made available is ever-increasing. This data allows us to perform crop yield prediction, track land-use change such as deforestation, monitor and respond to natural disasters and predict and mitigate climate change. The last two decades have seen a large increase in the application of machine learning algorithms in Earth observation in order to make efficient use of the growing data stream. Machine learning algorithms, however, are typically model-agnostic and too flexible, and so end up not respecting fundamental laws of physics. On the other hand, there has in recent years been an increase in research attempting to embed physics knowledge in machine learning algorithms in order to obtain interpretable and physically meaningful solutions. The main objective of this thesis is to explore different ways of encoding physical knowledge to provide machine learning methods tailored for specific problems in remote sensing. Ways of expressing expert knowledge about the relevant physical systems in remote sensing abound, ranging from simple relations between reflectance indices and biophysical parameters to complex models that compute the radiative transfer of electromagnetic radiation through our atmosphere, and differential equations that explain the dynamics of key parameters.
This thesis focuses on inversion problems, emulation of radiative transfer models, and incorporation of the abovementioned domain knowledge in machine learning algorithms for remote sensing applications. We explore new methods that can optimally model simulated and in-situ data jointly, incorporate differential equations in machine learning algorithms, handle more complex inversion problems and large-scale data, obtain accurate and computationally efficient emulators that are consistent with physical models, and that efficiently perform approximate Bayesian inversion over radiative transfer models

    Model-based quality assessment of tower-based field spectroscopy measurements

    Recent and upcoming satellite missions providing high-quality spectrometric measurements are used for vegetation monitoring and studies of ecosystem functioning, which are becoming increasingly important in the context of climate change. The calibration and validation of these measurements are crucial but remain a challenge. The need for in-situ references is high and is expected to increase with the trend toward mini-satellites without onboard calibration systems. In-situ measurements, however, need to be validated themselves before being used as a reference for air- or space-borne sensors. Cross-validation of measurements with additional independent measurements is established but costly. Three approaches using two Radiative Transfer Models (RTMs), namely the library for Radiative transfer (libRadtran) and the Soil Canopy Observation of Photosynthesis and Energy Fluxes Model (SCOPE), were built to validate in-situ irradiance and radiance measurements based on simulations. The performance of the approaches was assessed from summer to late autumn and over a single clear-sky day, resulting in an average Root Mean Square Relative Error (RMSRE) below 10% for irradiance simulations and an RMSRE of 10%-38% for radiance simulations compared to in-situ measurements. The higher RMSRE of radiance simulations originates in misspecifications of the reflectance spectrum, which is either assumed constant (approach 1) or modelled (approaches 2 and 3) based on vegetation parameters. The vegetation parameters, however, are themselves subject to large uncertainty. Shadowing on the vegetation canopy can additionally lead to ill-posed vegetation parameter selection. The experiments show the potential of coupled RTM-based quality assessment of high-frequency field measurements but also indicate the need for more accurate vegetation canopy parameter estimates and a more sophisticated optimization process to avoid the effects of ill-posedness.

    A Survey on Gaussian Processes for Earth-Observation Data Analysis: A Comprehensive Investigation

    Gaussian processes (GPs) have experienced tremendous success in biogeophysical parameter retrieval in the last few years. GPs constitute a solid Bayesian framework to consistently formulate many function approximation problems. This article reviews the main theoretical GP developments in the field, considering new algorithms that respect signal and noise characteristics, extract knowledge via automatic relevance kernels to yield feature rankings automatically, and allow applicability of associated uncertainty intervals to transport GP models in space and time that can be used to uncover causal relations between variables and can encode physically meaningful prior knowledge via radiative transfer model (RTM) emulation. The important issue of computational efficiency will also be addressed. These developments are illustrated in the field of geosciences and remote sensing at local and global scales through a set of illustrative examples. In particular, important problems for land, ocean, and atmosphere monitoring are considered, from accurately estimating oceanic chlorophyll content and pigments to retrieving vegetation properties from multi- and hyperspectral sensors as well as estimating atmospheric parameters (e.g., temperature, moisture, and ozone) from infrared sounders

    District Heating Network Demand Prediction Using a Physics-Based Energy Model with a Bayesian Approach for Parameter Calibration

    The heat demand of a district heating network needs to be accurately predicted and managed to reduce consumption and emissions. Detailed thermal parameters are essential for predictions using physics-based energy models, but they are not always available or sufficiently accurate. To reduce the simulation time in calibration and the dependence on accurate building data, this paper develops a prediction approach using a building energy model whose parameters are calibrated by a Bayesian calibration method to match the recorded monthly heat demand. In the proposed calibration approach, an emulator is established to evaluate untested values of the thermal parameters using a Bayesian method, and the evaluation results are then used to search for the most suitable parameter values. The designed approach greatly accelerates calibration. The method is used to calibrate a single parameter and multiple parameters of the building thermal energy models for a district heating network. After being verified with measured data, the developed calibration method is used to calibrate the parameters of building energy models. The calibrated model can then predict the hourly building heat demand in district heating networks.
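
The calibration loop described in this abstract (run the expensive model at a few parameter values, fit an emulator, then use the emulator to score untested values against recorded monthly demand) might look roughly like the sketch below. Everything here is an assumption for illustration: `simulator` is a hypothetical one-parameter building model, the emulator is a cubic polynomial per month, the prior is uniform on a grid, and the "recorded" data are synthetic.

```python
import numpy as np

# Assumed monthly heating degree days (illustrative values only).
DEGREE_DAYS = np.array([500, 420, 350, 200, 80, 10, 0, 0, 60, 220, 380, 480], dtype=float)

def simulator(u):
    """Hypothetical 'expensive' building model: monthly heat demand for heat-loss parameter u."""
    return DEGREE_DAYS * u / (1.0 + 0.2 * u)

# 1. Run the simulator at a small design of parameter values.
design = np.linspace(0.5, 2.5, 6)
runs = np.array([simulator(u) for u in design])      # shape (6, 12)

# 2. Fit a cheap emulator: one cubic polynomial in u per month.
coefs = np.polyfit(design, runs, deg=3)               # shape (4, 12)

def emulator(u_grid):
    """Emulated monthly demand for every parameter value in u_grid, shape (G, 12)."""
    return np.vander(u_grid, 4) @ coefs

# 3. Synthetic 'recorded' monthly demand from an assumed true parameter plus noise.
rng = np.random.default_rng(1)
u_true, sigma = 1.3, 10.0
observed = simulator(u_true) + rng.normal(0.0, sigma, size=12)

# 4. Grid posterior under a uniform prior and a Gaussian likelihood.
grid = np.linspace(0.5, 2.5, 2001)
log_post = -0.5 * np.sum((emulator(grid) - observed) ** 2, axis=1) / sigma ** 2
post = np.exp(log_post - log_post.max())              # subtract max for numerical stability
post /= post.sum()
u_mean = float(np.sum(grid * post))
```

The simulator is called only six times; all 2001 candidate values are scored by the emulator, which is where the speed-up in calibration comes from.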

    Quantifying Vegetation Biophysical Variables from Imaging Spectroscopy Data: A Review on Retrieval Methods

    An unprecedented spectroscopic data stream will soon become available with forthcoming Earth-observing satellite missions equipped with imaging spectroradiometers. This data stream will open up a vast array of opportunities to quantify a diversity of biochemical and structural vegetation properties. The processing requirements for such large data streams require reliable retrieval techniques enabling the spatiotemporally explicit quantification of biophysical variables. With the aim of preparing for this new era of Earth observation, this review summarizes the state-of-the-art retrieval methods that have been applied in experimental imaging spectroscopy studies inferring all kinds of vegetation biophysical variables. Identified retrieval methods are categorized into: (1) parametric regression, including vegetation indices, shape indices and spectral transformations; (2) nonparametric regression, including linear and nonlinear machine learning regression algorithms; (3) physically based, including inversion of radiative transfer models (RTMs) using numerical optimization and look-up table approaches; and (4) hybrid regression methods, which combine RTM simulations with machine learning regression methods. For each of these categories, an overview of widely applied methods with application to mapping vegetation properties is given. In view of processing imaging spectroscopy data, a critical aspect involves the challenge of dealing with spectral multicollinearity. The ability to provide robust estimates, retrieval uncertainties and acceptable retrieval processing speed are other important aspects in view of operational processing. Recommendations towards new-generation spectroscopy-based processing chains for operational production of biophysical variables are given

    Framework for emulation and uncertainty quantification of a stochastic building performance simulator

    A good framework for the quantification and decomposition of uncertainties in dynamic building performance simulation should: (i) simulate the principal deterministic processes influencing heat flows and the stochastic perturbations to them, (ii) quantify and decompose the total uncertainty into its respective sources, and the interactions between them, and (iii) achieve this in a computationally efficient manner. In this paper we introduce a new framework which, for the first time, does just that. We present the detailed development of this framework for emulating the mean and the variance in the response of a stochastic building performance simulator (EnergyPlus co-simulated with a multi-agent stochastic simulator called No-MASS), for heating and cooling load predictions. We demonstrate and evaluate the effectiveness of these emulators, applied to a monozone office building. With a range of 25–50 kWh/m², the epistemic uncertainty due to envelope parameters dominates over the aleatory uncertainty relating to occupants' interactions, which ranges from 6–8 kWh/m², for heating loads. The converse is observed for cooling loads, which vary by just 3 kWh/m² for envelope parameters, compared with 8–22 kWh/m² for their aleatory counterparts. This is due to the larger stimuli provoking occupants' interactions. Sensitivity indices corroborate this result, with wall insulation thickness (0.97) and occupants' behaviours (0.83) having the highest impacts on heating and cooling load predictions respectively. This new emulator framework (including training and subsequent deployment) achieves a factor of c. 30 reduction in the total computational budget, whilst overwhelmingly maintaining predictions within a 95% confidence interval, and successfully decomposing prediction uncertainties.
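
One standard way to formalise the split between epistemic and aleatory contributions, given emulators of the conditional mean and variance, is the law of total variance. The notation below is assumed for illustration ($Y$ for the simulated load, $\theta$ for the uncertain envelope parameters) and is not taken from the paper.

```latex
\operatorname{Var}(Y)
  = \underbrace{\operatorname{Var}_{\theta}\!\bigl(\mathbb{E}[Y \mid \theta]\bigr)}_{\text{epistemic (parameters)}}
  + \underbrace{\mathbb{E}_{\theta}\!\bigl[\operatorname{Var}(Y \mid \theta)\bigr]}_{\text{aleatory (stochastic occupants)}}
```

Under this reading, an emulator of the mean response supplies the first term and an emulator of the response variance supplies the second.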