349 research outputs found

    Validation of coastal forecasts

    Deliverable D2.3 of the MyWave project: A pan-European concerted and integrated approach to operational wave modelling and forecasting – a complement to GMES MyOcean services. Work programme topic: SPA.2011.1.5.03 – R&D to enhance future GMES applications in the Marine and Atmosphere areas. Funded under: FP7-SPACE-2011-284455.

    Integrating Physics Modelling with Machine Learning for Remote Sensing

    Earth observation through satellite sensors, models and in situ measurements provides a way to monitor our planet with unprecedented spatial and temporal resolution. The amount and diversity of the data recorded and made available are ever-increasing. These data allow us to perform crop yield prediction, track land-use change such as deforestation, monitor and respond to natural disasters, and predict and mitigate climate change. The last two decades have seen a large increase in the application of machine learning algorithms in Earth observation in order to make efficient use of the growing data stream. Machine learning algorithms, however, are typically model-agnostic and too flexible, and so end up not respecting fundamental laws of physics. On the other hand, there has in recent years been an increase in research attempting to embed physics knowledge in machine learning algorithms in order to obtain interpretable and physically meaningful solutions. The main objective of this thesis is to explore different ways of encoding physical knowledge to provide machine learning methods tailored for specific problems in remote sensing. Ways of expressing expert knowledge about the relevant physical systems in remote sensing abound, ranging from simple relations between reflectance indices and biophysical parameters to complex models that compute the radiative transfer of electromagnetic radiation through our atmosphere, and differential equations that explain the dynamics of key parameters.
    This thesis focuses on inversion problems, emulation of radiative transfer models, and the incorporation of the abovementioned domain knowledge in machine learning algorithms for remote sensing applications. We explore new methods that can optimally model simulated and in-situ data jointly, incorporate differential equations in machine learning algorithms, handle more complex inversion problems and large-scale data, obtain accurate and computationally efficient emulators that are consistent with physical models, and efficiently perform approximate Bayesian inversion over radiative transfer models.
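The emulation idea described above — replacing an expensive physical simulator with a cheap statistical approximation trained on a finite set of simulator runs — can be sketched as follows. The "simulator" here is a hypothetical toy function standing in for a radiative transfer model, and the polynomial emulator is our illustration, not a method from the thesis:

```python
import numpy as np

# Toy stand-in for an expensive radiative-transfer simulator (hypothetical).
def simulator(x):
    return np.exp(-3.0 * x) + 0.5 * np.sin(2 * np.pi * x)

# A finite set of "expensive" training runs.
X_train = np.linspace(0.0, 1.0, 20)
y_train = simulator(X_train)

# Emulator: least-squares fit of a small polynomial basis to the runs.
coeffs = np.polyfit(X_train, y_train, deg=6)

def emulate(x):
    # Cheap surrogate evaluation replacing a full simulator call.
    return np.polyval(coeffs, x)

# The emulator approximates the simulator between training points.
x_new = 0.37
print(abs(emulate(x_new) - simulator(x_new)))
```

In practice the emulator would be a Gaussian process or neural network over many input dimensions, but the workflow is the same: run the simulator a limited number of times, fit the surrogate, then query the surrogate wherever many evaluations are needed.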

    Machine learning methods for uncertainty quantification in subsurface reservoirs

    We investigate current challenges in the reservoir engineering pipeline that can be addressed using recent machine learning techniques. Our emphasis is on improving the performance of uncertainty quantification tasks, which are ubiquitous in subsurface reservoir simulations. In one work, we accelerate multiscale methods by embedding a neural network surrogate for the fast computation of the custom basis functions, replacing the need to solve the local elliptic problems normally required to obtain them. In a different work, we address current challenges in obtaining geological parametrizations that can capture complex geological structures. We adopt a neural network parametrization using a recent unsupervised learning technique, obtaining an effective parametrization that can reproduce high-order statistics of flow responses. In a follow-up work, we introduce a method for post-hoc conditioning of the neural network parametrization to generate conditional realizations by training a second neural network to sample from a Bayesian posterior and coupling it with the original network. In our final work, we introduce a framework for exemplar-based parametric synthesis of geological images based on a recent kernel method, obtaining a neural network parametrization of the geology from a single exemplar image.

    Adaptive nonlinear control using fuzzy logic and neural networks

    The problem of adaptive nonlinear control, i.e., the control of nonlinear dynamic systems with unknown parameters, is considered. Current techniques usually assume that either the control system is linearizable or the type of nonlinearity is known. This results in poor control quality for many practical problems. Moreover, the control system design becomes too complex for a practicing engineer. The objective of this thesis is to provide a practical, systematic approach for solving the problem of identification and control of nonlinear systems with unknown parameters, when an explicit linear parametrization is either unknown or impossible. Fuzzy logic (FL) and neural networks (NNs) have proven to be tools for universal approximation, and hence are considered. However, FL requires expert knowledge, and there is a lack of systematic procedures to design NNs for control. A hybrid technique, called the fuzzy logic adaptive network (FLAN), which combines the structure of an FL controller with the learning aspects of NNs, is developed. FLAN is designed such that it is capable of both structure learning and parameter learning. A gradient-descent-based technique is utilized for the parameter learning in FLAN, and it is tested through a variety of simulated experiments in identification and control of nonlinear systems. The results indicate the success of FLAN in terms of accuracy of estimation, speed of convergence, insensitivity to a range of initial learning rates, and robustness against sudden changes in the input as well as noise in the training data. The performance of FLAN is also compared with techniques based on FL and NNs, as well as several hybrid techniques.
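The parameter-learning step above — gradient descent on the consequents of a fuzzy system so it identifies an unknown nonlinearity — can be sketched roughly as follows. This is a minimal zero-order Takagi-Sugeno system of our own construction (fixed Gaussian membership functions, learned rule consequents), not the FLAN implementation from the thesis:

```python
import numpy as np

rng = np.random.default_rng(1)
centers = np.linspace(-1, 1, 7)   # fixed Gaussian membership-function centers
width = 0.3
w = np.zeros(7)                   # rule consequents, tuned by learning

def fuzzy_out(x, w):
    mu = np.exp(-((x - centers) / width) ** 2)  # rule firing strengths
    mu /= mu.sum()                              # normalized memberships
    return mu @ w, mu

target = lambda x: np.tanh(3 * x)  # toy "unknown" plant nonlinearity

lr = 0.5
for _ in range(2000):
    x = rng.uniform(-1, 1)
    y, mu = fuzzy_out(x, w)
    e = target(x) - y
    w += lr * e * mu               # gradient step on the squared error

print(abs(fuzzy_out(0.5, w)[0] - target(0.5)))
```

After training, the consequents encode the nonlinearity rule by rule, which is the interpretability benefit of a fuzzy structure over a generic NN.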

    Quantifying Sea-Level Rise Hazard Addressing Climate Forcing and Projection Uncertainty

    Sea-level rise, driven by climate change, puts coastal communities and ecosystems at risk. Major sources that contribute to sea-level rise include ocean thermal expansion, glacier loss, and ice sheet loss. Here we account for uncertainty in modeling these sources, along with climate forcing uncertainty. Ocean thermal expansion uncertainty is modeled using a probabilistic ensemble of climate models and climate forcing scenarios. The ensemble addresses model uniqueness and weights models and scenarios based on their ability to reproduce observed sea-level trends. Glacier sea-level rise is modeled by updating an existing glacier mass balance model with a probabilistic regional covariance model that addresses the scarcity of historical glacier observational data. This model is used to simulate glacier melt and associated patterns of sea-level rise. Ice sheet mass balance change is modeled using a kernel-density-based probabilistic ensemble of perturbed-physics ice sheet models. The kernel-density model does not need to assume the shape of the ice sheet sampling space and rewards ice sheet models that reproduce observed ice sheet physics. As the computational cost of climate and ice sheet models can make probabilistic studies difficult, emulation methods are explored for estimating model outputs for forcing scenarios of interest. A nonlinear dual model for emulating climate model thermosteric and dynamic sea-level rise predictions is shown to outperform existing linear methods. Climate forcing is modeled using a probabilistic emissions rate growth model that addresses the impact of international climate agreements and estimates the relative likelihoods of forcing scenarios. Climate agreements have a large influence on the relative likelihoods of low-mitigation forcing scenarios. Probabilistic sea-level rise hazard analysis is illustrated using a set of sea-level rise prediction models and forcing scenarios. Deaggregation of the hazard analysis results shows that ice sheet model projections and climate forcing dominate probabilistic sea-level rise hazard. Probabilistic hazard analysis is a step toward informing decision makers about how to mitigate and adapt to future sea-level rise.
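The ensemble-weighting idea above — rewarding models that reproduce observed sea-level trends — can be sketched as a simple likelihood weighting. All numbers below are illustrative placeholders, not values from the thesis:

```python
import numpy as np

# Observed trend the ensemble members are scored against (illustrative).
observed_trend = 3.3                                  # mm/yr
model_trends   = np.array([2.1, 3.0, 3.4, 4.5])       # each member's hindcast trend
projections    = np.array([0.45, 0.60, 0.70, 0.95])   # each member's projection (m)

sigma = 0.5   # assumed observational uncertainty on the trend

# Gaussian likelihood of each member given the observation, normalized
# so that the weights form a probability distribution over members.
w = np.exp(-0.5 * ((model_trends - observed_trend) / sigma) ** 2)
w /= w.sum()

# Probability-weighted projection: members closer to the observed trend
# contribute more to the ensemble estimate.
weighted_projection = w @ projections
print(w, weighted_projection)
```

The same weighting extends to scoring forcing scenarios alongside models, which is how the abstract describes the thermal-expansion ensemble.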

    Bayesian nonparametric inference in mechanistic models of complex biological systems

    Parameter estimation in expensive computational models is a problem that commonly arises in science and engineering. With the increase in computational power, modellers started developing simulators of real-life phenomena that are computationally intensive to evaluate. This, however, makes inference prohibitive due to the unit cost of a single function evaluation. This thesis focuses on computational models of biological and biomechanical processes such as left-ventricular dynamics and the human pulmonary blood circulatory system. In the former model a single forward simulation takes on the order of 11 minutes of CPU time, while the latter takes approximately 23 seconds on our machines. Markov chain Monte Carlo methods or likelihood maximization using iterative algorithms would take days or weeks to provide a result. This makes them unsuitable for clinical decision support systems, where a decision must be taken in a reasonable time frame. I discuss how to accelerate the inference by using the concept of emulation, i.e. by replacing a computationally expensive function with a statistical approximation based on a finite set of expensive training runs. The emulation target can be either the output domain, representing the standard approach in the emulation literature, or the loss domain, which is an alternative and different perspective. Then, I demonstrate how this approach can be used to estimate the parameters of expensive simulators. First, I apply loss-emulation to a nonstandard variant of the Lotka-Volterra model of prey-predator interactions, in order to assess whether the approach is approximately unbiased. Next, I present a comprehensive comparison between output-emulation and loss-emulation on a computational model of left-ventricular dynamics, with the goal of inferring the constitutive law relating the myocardial stretch to its strain. This is especially relevant for assessing cardiac function post myocardial infarction. The results show how it is possible to estimate the stress-strain curve in just 15 minutes, compared to the one week required by the current best method in the literature: a reduction in computational cost of three orders of magnitude. Next, I review Bayesian optimization (BO), an algorithm to optimize a computationally expensive function by adaptively improving the emulator. This method is especially useful in scenarios where the simulator is not considered a "stable release"; for example, the simulator could still be undergoing further development, bug fixing, and improvements. I develop a new framework based on BO to estimate the parameters of a partial differential equation (PDE) model of the human pulmonary blood circulation. The parameters, being related to vessel structure and stiffness, are important indicators of pulmonary hypertension risk, and need to be estimated because they can only be measured with invasive experiments. The results using simulated data show how it is possible to estimate a patient's vessel properties in a time frame suitable for clinical applications. I then demonstrate a limitation of standard improvement-based acquisition functions for Bayesian optimization. The expected improvement (EI) policy recommends query points where the improvement is on average high; however, it does not account for the variance of the improvement random variable. I define a new acquisition function, called ScaledEI, which recommends query points where the improvement on the incumbent minimum is expected to be high, with high confidence. This new BO algorithm is compared to acquisition functions from the literature on a large set of benchmark functions for global optimization, where it turns out to be a powerful default choice for Bayesian optimization. ScaledEI is then compared to standard non-Bayesian optimization solvers, to confirm that the policy still leads to a reduction in the number of forward simulations required to reach a given tolerance level on the function value. Finally, the new algorithm is applied to the problem of estimating the PDE parameters of the pulmonary circulation model previously discussed.
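The motivation for scaling EI can be sketched for a single candidate with a Gaussian posterior: standard EI can rank a high-variance candidate above a low-variance one even when the low-variance improvement is far more certain, and normalizing by the standard deviation of the improvement reverses that ranking. The code below uses the standard Gaussian moments of the improvement random variable; the thesis's exact ScaledEI definition may differ from this sketch:

```python
import math

def ei_and_scaled_ei(mu, sigma, f_min):
    # Posterior at the candidate is N(mu, sigma^2); improvement is
    # I = max(f_min - Y, 0) for a minimization problem.
    phi = lambda z: math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)  # N(0,1) pdf
    Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))       # N(0,1) cdf
    z = (f_min - mu) / sigma
    ei = sigma * (z * Phi(z) + phi(z))                   # E[I]
    e_i2 = sigma**2 * ((z * z + 1.0) * Phi(z) + z * phi(z))  # E[I^2]
    var = max(e_i2 - ei * ei, 0.0)                       # Var[I]
    scaled = ei / math.sqrt(var) if var > 0 else float("inf")
    return ei, scaled

# Low-variance vs high-variance candidate with the same posterior mean:
print(ei_and_scaled_ei(mu=0.0, sigma=0.1, f_min=0.05))
print(ei_and_scaled_ei(mu=0.0, sigma=1.0, f_min=0.05))
```

Here the high-variance candidate wins under plain EI, while the scaled criterion prefers the candidate whose improvement is expected to be high with high confidence.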

    Remote sensing of phytoplankton biomass in oligotrophic and mesotrophic lakes: addressing estimation uncertainty through machine learning

    Phytoplankton constitute the bottom of the aquatic food web, produce half of Earth’s oxygen and are part of the global carbon cycle. A measure of aquatic phytoplankton biomass therefore functions as a biological indicator of water status and quality. The abundance of phytoplankton in most lakes on Earth is low because they are weakly nourished (i.e., oligotrophic). It is practically infeasible to measure the millions of oligotrophic lakes on Earth through field sampling. Fortunately, phytoplankton universally contain the optically active pigment chlorophyll-a, which can be detected by optical sensors. Earth-orbiting satellite missions carry optical sensors that provide unparalleled spatial coverage and temporal revisit frequency of lakes. However, compared to waters with high nutrient loading (i.e., eutrophic), the remote sensing estimation of phytoplankton biomass in oligotrophic lakes is prone to high estimation uncertainties. Accurate retrieval of phytoplankton biomass is severely constrained by imperfect atmospheric correction, complicated inherent optical property (IOP) compositions, and limited model applicability. In order to address and reduce the current estimation uncertainties in phytoplankton remote sensing of low-to-moderate biomass lakes, machine learning is used in this thesis. In the first chapter, the chlorophyll-a concentration (chla) estimation uncertainty from 13 chla algorithms is characterised. The uncertainty characterisation follows a two-step procedure: (1) estimation of chla from a representative dataset of field measurements and quantification of the estimation uncertainty, and (2) characterisation of the chla estimation uncertainty. The results of this study show that estimation uncertainty across the dataset used in this chapter is high, whereby chla is both systematically under- and overestimated by the tested algorithms. Further, the characterisation reveals algorithm-specific causes of estimation uncertainty.
    The uncertainty sources for each of the tested algorithms are discussed and recommendations are provided to improve the estimation capabilities. In the second chapter, a novel machine learning algorithm for chla estimation is developed by combining Bayesian theory with neural networks (NNs). The resulting Bayesian Neural Networks (BNNs) are designed for the Ocean and Land Colour Instrument (OLCI) and MultiSpectral Imager (MSI) sensors aboard the Sentinel-3 and Sentinel-2 satellites, respectively. Unlike established chla algorithms, the BNNs provide a per-pixel uncertainty associated with the estimated chla. Compared to reference chla algorithms, gains in chla estimation accuracy > 15% are achieved. Moreover, the quality of the provided BNN chla uncertainty is analysed. For most observations (> 75%) the BNN uncertainty estimate covers the reference in situ chla value, but the uncertainty calibration is not consistently accurate across several assessment strategies. The BNNs are applied to OLCI and MSI products to generate chla and uncertainty estimates in lakes from Africa, Canada, Europe and New Zealand. The BNN uncertainty estimate is furthermore used to deal with uncertainty introduced by prior atmospheric correction algorithms, adjacency effects and complex optical property compositions. The third chapter focuses on the estimation of lake biomass in terms of trophic status (TS). TS is conventionally estimated through chla. However, the remote sensing of chla, as shown in the two previous chapters, can be prone to high uncertainty. Therefore, in this chapter an algorithm for the direct classification of TS is designed. Instead of using a single algorithm for TS estimation, multiple individual algorithms are combined through stacking, and their estimates are evaluated by a higher-level meta-learner. The results of this ensemble scheme are compared to conventional switching of reference chla algorithms through optical water types (OWTs).
    The results show that TS estimation accuracy is improved through direct classification rather than indirect estimation through chla. The designed meta-learning algorithm outperforms OWT switching of chla algorithms by 5-12%. The highest TS estimation accuracy is achieved for high-biomass waters, whereas extremely turbid, low-biomass waters produced high TS estimation uncertainty. Combining an ensemble of algorithms through a meta-learner represents a solution to the problem of algorithm selection across the large variation of global lake constituent concentrations and optical properties.
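The stacking idea above — combining several base algorithms through a higher-level meta-learner — can be sketched as follows. The data, the two deliberately weak base learners, and the logistic-regression meta-learner are toy stand-ins of our own, not the thesis's ensemble:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
x = rng.normal(size=(n, 2))
# Toy binary "trophic status" label with a known decision rule.
y = (x[:, 0] + 0.5 * x[:, 1] > 0).astype(float)

# Base-learner scores: each base learner only sees one feature,
# so each is individually weak.
s1 = x[:, 0]
s2 = x[:, 1]
Z = np.column_stack([s1, s2, np.ones(n)])  # meta-features + bias term

# Meta-learner: logistic regression over the base scores,
# fit by full-batch gradient descent.
w = np.zeros(3)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-Z @ w))       # predicted probabilities
    w -= 0.1 * Z.T @ (p - y) / n           # gradient step on log-loss

acc = np.mean(((Z @ w) > 0) == (y > 0.5))
print(acc)
```

The meta-learner discovers how much to trust each base score, which is the mechanism that lets a stacked ensemble outperform any single switched algorithm.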

    Inverse estimation for the simple earth system model ACC2 and its applications

    The Aggregated Carbon Cycle, Atmospheric Chemistry, and Climate model (ACC2) (Tanaka and Kriegler et al., 2007a) describes physical-biogeochemical processes in the Earth system at a global-annual-mean level. Compared to its predecessors NICCS (Hooss, 2001) and ICM (Bruckner et al., 2003), ACC2 adopts more detailed parameterizations of atmospheric chemistry involving a set of agents (CO2, CH4, N2O, O3, SF6, 29 species of halocarbons, sulfate aerosols (direct effect), carbonaceous aerosols (direct effect), all aerosols (indirect effect), stratospheric H2O, OH, and the pollutants NOx, CO, and VOC). In contrast to the Impulse Response Function (IRF) approaches in the predecessor models, ACC2 uses DOECLIM (Kriegler, 2005), a land-ocean Energy Balance Model (EBM), to calculate temperature change. The carbon cycle is described by box models based on the IRF approach. A temperature feedback on ocean and land CO2 uptake is newly implemented. The most novel aspect of ACC2 is its inverse estimation, the first attempt to estimate uncertain parameters simultaneously for the carbon cycle, atmospheric chemistry, and the climate system by taking their interactions into account. The theoretical underpinning of the ACC2 inversion is probabilistic inverse estimation theory (Tarantola, 2005), which characterizes the ACC2 inversion as an integration of the existing Earth system knowledge. This includes parameter estimates, observational databases, reconstructions, and physical-biogeochemical laws…
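The probabilistic inverse estimation idea (Tarantola, 2005) can be sketched for the linear-Gaussian case: the estimate balances data misfit against prior parameter knowledge, each weighted by its covariance. The forward model and all numbers below are toy illustrations, not ACC2 quantities:

```python
import numpy as np

# Toy linearized forward model d = G m (hypothetical, 3 data, 2 parameters).
G = np.array([[1.0, 2.0],
              [0.5, 1.5],
              [2.0, 0.5]])
d_obs = np.array([3.1, 2.0, 2.4])     # observations
C_d = np.diag([0.1, 0.1, 0.1])        # data covariance (observation errors)
m_prior = np.array([1.0, 1.0])        # prior parameter estimate
C_m = np.diag([1.0, 1.0])             # prior parameter covariance

# Posterior mean of the Gaussian linear inverse problem:
#   m = (G^T C_d^-1 G + C_m^-1)^-1 (G^T C_d^-1 d_obs + C_m^-1 m_prior)
A = G.T @ np.linalg.inv(C_d) @ G + np.linalg.inv(C_m)
b = G.T @ np.linalg.inv(C_d) @ d_obs + np.linalg.inv(C_m) @ m_prior
m_post = np.linalg.solve(A, b)
print(m_post)
```

The posterior pulls the parameters toward values that fit the observations while staying consistent with the prior, which is the same trade-off the ACC2 inversion makes simultaneously across carbon cycle, chemistry, and climate parameters.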