
    Multilevel ensemble data assimilation

    This thesis aims to investigate and improve the efficiency of ensemble transform methods for data assimilation through an application of multilevel Monte Carlo. Multilevel Monte Carlo is a framework for estimating statistics of discretized random variables that uses a hierarchy of discretizations with increasingly refined resolution, in contrast to standard Monte Carlo estimators, which use only a discretization at a single fine resolution. A linear combination of sub-estimators on the different levels of this hierarchy provides statistical estimators of random variables at the finest level of resolution with significantly greater efficiency than a standard Monte Carlo equivalent. The extension to computing filtering estimators for data assimilation is therefore a natural but challenging area of study. The challenge arises because correlation must be imparted between ensembles on adjacent levels of resolution and maintained during the assimilation of data. The methodology proposed in this thesis considers coupling algorithms to establish this correlation, generating multilevel estimators that significantly reduce the computational expense of propagating ensembles of discretizations through time and space between stages of data assimilation. An effective benchmark of this methodology is realised by filtering data into high-dimensional spatio-temporal systems, where solving the underlying partial differential equations carries a high computational complexity. A novel extension of an ensemble transform localisation framework to finite element approximations within random spatio-temporal systems is proposed, in addition to a multilevel equivalent.
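
    For context, the identity behind this construction is the standard multilevel Monte Carlo telescoping sum (written here in generic notation, which need not match the thesis's own):

    \[
    \mathbb{E}[X_L] = \mathbb{E}[X_0] + \sum_{\ell=1}^{L} \mathbb{E}\bigl[X_\ell - X_{\ell-1}\bigr],
    \qquad
    \hat{Y} = \frac{1}{N_0}\sum_{i=1}^{N_0} X_0^{(i)} + \sum_{\ell=1}^{L} \frac{1}{N_\ell}\sum_{i=1}^{N_\ell} \bigl(X_\ell^{(i)} - X_{\ell-1}^{(i)}\bigr),
    \]

    where $X_\ell$ is the quantity of interest at resolution level $\ell$ and $N_\ell$ is the number of samples on that level. Each correction term uses coupled coarse and fine samples; the positive correlation between $X_\ell^{(i)}$ and $X_{\ell-1}^{(i)}$ shrinks the variance of the differences, and it is precisely this correlation that the coupling algorithms above must impart and then maintain through each assimilation step.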

    Mathematical and Algorithmic Aspects of Data Assimilation in the Geosciences

    The field of “Data Assimilation” has been driven by applications from the geosciences, where complex mathematical models are interfaced with observational data in order to improve model forecasts. Mathematically, data assimilation is closely related to filtering and smoothing on the one hand, and to inverse problems and statistical inference on the other. Key challenges of data assimilation arise from the high dimensionality of the underlying models, combined with systematic spatio-temporal model errors, model uncertainty quantification, and relatively sparse observation networks. Advances in the field will require a combination of a broad range of mathematical techniques from differential equations, statistics, machine learning, probability, scientific computing, and mathematical modeling, together with insights from practitioners in the field. The workshop brought together a collection of scientists representing this broad spectrum of research strands.

    On the Calibration of Multilevel Monte Carlo Ensemble Forecasts

    The multilevel Monte Carlo method can efficiently compute statistical estimates of discretized random variables for a given error tolerance. Traditionally, a particular implementation of multilevel Monte Carlo computes only a certain statistic. This article considers the multilevel case in which one wants to verify and evaluate a single ensemble that forms an empirical approximation to many different statistics, namely an ensemble forecast. We propose a simple algorithm that, in the univariate case, allows one to derive a statistically consistent single ensemble forecast from the hierarchy of ensembles formed during an implementation of multilevel Monte Carlo. This ensemble forecast then allows the entire multilevel hierarchy of ensembles to be evaluated using standard ensemble forecast verification techniques. We demonstrate this for the case of evaluating the calibration of the forecast.
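
    The abstract does not reproduce the recombination algorithm itself. As a loose sketch of the general idea, one could form the MLMC estimate of the forecast CDF and invert it at equally spaced probability levels to draw one consistent ensemble; everything below (function names, the inverse-transform step, the monotonicity repair) is an illustrative assumption, not the authors' method:

    import numpy as np

    def mlmc_cdf(levels, x):
        # Telescoping MLMC estimate of P(X <= x). `levels` is a list of
        # (fine, coarse) sample arrays; on level 0 the coarse entry is None,
        # and each fine/coarse pair is assumed to come from coupled runs.
        F = np.mean(levels[0][0] <= x)
        for fine, coarse in levels[1:]:
            F += np.mean(fine <= x) - np.mean(coarse <= x)
        return F

    def single_ensemble(levels, m, grid):
        # Invert the MLMC CDF at m equally spaced probability levels to
        # obtain a single ensemble targeting the finest-level distribution.
        cdf = np.array([mlmc_cdf(levels, x) for x in grid])
        cdf = np.maximum.accumulate(np.clip(cdf, 0.0, 1.0))  # repair monotonicity
        probs = (np.arange(m) + 0.5) / m
        return np.interp(probs, cdf, grid)

    Here the input might look like levels = [(x0, None), (x1_fine, x0_coupled), ...], with paired arrays produced by coupled coarse/fine simulations.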

    Resilience for large ensemble computations

    With the increasing power of supercomputers, ever more detailed models of physical systems can be simulated, and ever larger problem sizes can be considered for any kind of numerical system. During the last twenty years the performance of the fastest clusters went from the teraFLOPS domain (ASCI RED: 2.3 teraFLOPS) to the pre-exaFLOPS domain (Fugaku: 442 petaFLOPS), and we will soon have the first supercomputer with a peak performance cracking the exaFLOPS barrier (El Capitan: 1.5 exaFLOPS). Ensemble techniques are experiencing a renaissance with the availability of these extreme scales, and recent techniques such as particle filters will especially benefit from them. Current ensemble methods in climate science, such as ensemble Kalman filters, exhibit a linear dependency between the problem size and the ensemble size, while particle filters show an exponential dependency. Nevertheless, with the prospect of massive computing power come challenges such as power consumption and fault tolerance. The mean time between failures shrinks with the number of components in the system, and failures are expected every few hours at exascale. In this thesis, we explore and develop techniques to protect large ensemble computations from failures. We present novel approaches in differential checkpointing, elastic recovery, fully asynchronous checkpointing, and checkpoint compression. Furthermore, we design and implement a fault-tolerant particle filter with pre-emptive particle prefetching and caching. Finally, we design and implement a framework for the automatic validation and application of lossy compression in ensemble data assimilation. Altogether, we present five contributions in this thesis: the first two improve state-of-the-art checkpointing techniques, and the last three address the resilience of ensemble computations. The contributions are stand-alone fault-tolerance techniques; however, they can also be combined to improve each other's properties. For instance, we utilize elastic recovery (2nd contribution) to improve resilience in an online ensemble data assimilation framework (3rd contribution), and we build our validation framework (5th contribution) on top of our particle filter implementation (4th contribution). We further demonstrate that our contributions improve resilience and performance with experiments on various architectures such as Intel, IBM, and ARM processors.
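
    As a minimal illustration of the differential checkpointing idea (persisting only the blocks of state that changed since the previous checkpoint), consider the following sketch; the block size, hashing scheme, and file layout are invented for illustration, and production fault-tolerance libraries add metadata, redundancy, and asynchronous I/O on top of this:

    import hashlib
    import os

    BLOCK = 1 << 20  # 1 MiB blocks (illustrative choice)

    def differential_checkpoint(state: bytes, ckpt_dir: str, prev_hashes: dict) -> dict:
        # Hash each block of the serialized state and write only those
        # blocks whose hash differs from the previous checkpoint's.
        os.makedirs(ckpt_dir, exist_ok=True)
        new_hashes = {}
        for offset in range(0, len(state), BLOCK):
            block = state[offset:offset + BLOCK]
            digest = hashlib.sha256(block).hexdigest()
            new_hashes[offset] = digest
            if prev_hashes.get(offset) != digest:  # block changed: persist it
                path = os.path.join(ckpt_dir, "block_%d.bin" % offset)
                with open(path, "wb") as f:
                    f.write(block)
        return new_hashes

    Starting from prev_hashes = {}, each call returns the hash map to pass into the next call, so successive checkpoints only pay I/O for the state that actually changed between cycles.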

    A Review on Artificial Intelligence Applications for Grid-Connected Solar Photovoltaic Systems

    The use of artificial intelligence (AI) is increasing in various sectors of photovoltaic (PV) systems, due to growing computational power, tools, and data generation. The methods currently employed for various functions of the solar PV industry related to design, forecasting, control, and maintenance have been found to deliver relatively inaccurate results, whereas AI-based approaches to these tasks have achieved higher accuracy and precision, making them a topic of strong interest. In this context, this paper investigates how AI techniques impact the PV value chain. The investigation consists of mapping the currently available AI technologies, identifying possible future uses of AI, and quantifying their advantages and disadvantages with regard to conventional mechanisms.

    Data Mining Applications to Fault Diagnosis in Power Electronic Systems: A Systematic Review


    TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions

    Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their application to three-dimensional (3D) biomolecular structural data sets has been hindered by entangled geometric and biological complexity. We introduce topology, i.e., element-specific persistent homology (ESPH), to untangle geometric complexity from biological complexity. ESPH represents 3D complex geometry by one-dimensional (1D) topological invariants and retains crucial biological information via a multichannel image representation. It is able to reveal hidden structure-function relationships in biomolecules. We further integrate ESPH and convolutional neural networks to construct a multichannel topological neural network (TopologyNet) for the prediction of protein-ligand binding affinities and protein stability changes upon mutation. To overcome the limitations of deep learning that arise from small and noisy training sets, we present a multitask topological convolutional neural network (MT-TCNN). We demonstrate that the present TopologyNet architectures outperform other state-of-the-art methods in the prediction of protein-ligand binding affinities, globular protein mutation impacts, and membrane protein mutation impacts.
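
    The precise ESPH featurization is defined in the paper itself; as a loose illustration of the general idea (1D topological invariants arranged as a multichannel image for a convolutional network), one can rasterize persistence intervals per channel. The binning below is an assumption for illustration, not the paper's construction:

    import numpy as np

    def barcodes_to_image(barcodes, n_bins=128, t_max=50.0):
        # `barcodes` maps a channel name (e.g. an element pair such as
        # "C-N") to a list of (birth, death) persistence intervals. Each
        # channel counts how many intervals are alive at each filtration
        # value, giving a (channels, n_bins) array for a 1D CNN.
        grid = np.linspace(0.0, t_max, n_bins)
        names = sorted(barcodes)
        image = np.zeros((len(names), n_bins))
        for c, name in enumerate(names):
            for birth, death in barcodes[name]:
                image[c] += (grid >= birth) & (grid < death)
        return image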

    Assessing erosion and flood risk in the coastal zone through the application of multilevel Monte Carlo methods

    Coastal zones are vulnerable to both erosion and flood risk, which can be assessed using coupled hydro-morphodynamic models. However, the use of such models as decision support tools suffers from a high degree of uncertainty, due to both incomplete knowledge and natural variability in the system. In this work, we show for the first time how the multilevel Monte Carlo method (MLMC) can be applied in hydro-morphodynamic coastal ocean modelling, here using the popular model XBeach, to quantify uncertainty by computing statistics of key output variables given uncertain input parameters. MLMC accelerates the Monte Carlo approach through the use of a hierarchy of models at different levels of resolution. Several theoretical and real-world coastal zone case studies are considered, for which output variables key to the assessment of flood and erosion risk, such as wave run-up height and total eroded volume, are estimated. We show that MLMC can significantly reduce computational cost, resulting in speed-up factors of 40 or greater compared to a standard Monte Carlo approach while maintaining the same level of accuracy. Furthermore, a sophisticated ensemble-generating technique is used to estimate the cumulative distribution of output variables from the MLMC output. This allows the probability of a variable exceeding a certain value to be estimated, such as the probability of a wave run-up height exceeding the height of a seawall, a valuable capability that can be used to inform decision-making under uncertainty.
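
    As an illustration of how such an exceedance probability might be estimated, consider the hedged sketch below; run_model is a hypothetical wrapper around the hydro-morphodynamic solver, and the shared random seed stands in for whatever coupling of uncertain inputs between coarse and fine runs the study actually employs:

    import numpy as np

    def mlmc_exceedance(run_model, samples_per_level, threshold, rng):
        # MLMC telescoping estimate of P(Q > threshold), e.g. the probability
        # that wave run-up height exceeds the height of a seawall.
        # run_model(level, seed) runs the solver at the given resolution level
        # with fixed uncertain inputs, so paired coarse/fine runs are coupled.
        p = 0.0
        for level, n in enumerate(samples_per_level):
            for _ in range(n):
                seed = int(rng.integers(2**31))
                fine = run_model(level, seed) > threshold
                if level == 0:
                    p += fine / n
                else:
                    coarse = run_model(level - 1, seed) > threshold
                    p += (fine - coarse) / n
        return float(np.clip(p, 0.0, 1.0))

    With rng = np.random.default_rng(0), the same seed drives both members of each coarse/fine pair, which is what keeps the level differences low-variance.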