5 research outputs found

    Using Machine Learning for Model Physics: an Overview

    In the overview, a generic mathematical object (a mapping) is introduced, and its relation to model physics parameterization is explained. Machine learning (ML) tools that can be used to emulate and/or approximate mappings are introduced. Applications of ML to emulate existing parameterizations, to develop new parameterizations, to ensure physical constraints, and to control the accuracy of developed applications are described. Some ML approaches that allow developers to go beyond the standard parameterization paradigm are discussed. Comment: 50 pages, 3 figures, 1 table
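The central idea of the overview, treating a parameterization as a mapping y = M(x) and fitting a fast statistical emulator to it, can be illustrated with a toy sketch. This is a hypothetical example, not code from the paper: a single-hidden-layer tanh network trained by plain gradient descent to emulate a smooth one-dimensional stand-in mapping.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "parameterization": a smooth mapping y = M(x).
def target_mapping(x):
    return np.sin(x)

# Training data sampled over the input domain.
X = rng.uniform(-3.0, 3.0, size=(256, 1))
Y = target_mapping(X)

# Single-hidden-layer tanh network: y_hat = tanh(X W1 + b1) W2 + b2
H = 16
W1 = rng.normal(0.0, 1.0, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.1, (H, 1)); b2 = np.zeros(1)

lr = 0.1
for step in range(5000):
    A = np.tanh(X @ W1 + b1)      # hidden activations
    Y_hat = A @ W2 + b2           # emulator output
    err = Y_hat - Y
    # Backpropagation of the mean-squared-error gradient.
    gY = 2.0 * err / len(X)
    gW2 = A.T @ gY;  gb2 = gY.sum(0)
    gA = gY @ W2.T
    gZ = gA * (1.0 - A ** 2)      # tanh derivative
    gW1 = X.T @ gZ;  gb1 = gZ.sum(0)
    W2 -= lr * gW2;  b2 -= lr * gb2
    W1 -= lr * gW1;  b1 -= lr * gb1

mse = float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - Y) ** 2))
```

Once trained, the network is evaluated with two matrix products instead of the original scheme, which is the source of the speed-up the overview discusses.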

    APPLICATION OF NEURAL NETWORKS TO EMULATION OF RADIATION PARAMETERIZATIONS IN GENERAL CIRCULATION MODELS

    A novel approach based on using neural network (NN) techniques for approximation of physical components of complex environmental systems has been applied and further developed in this dissertation. A new type of numerical model, a complex hybrid environmental model based on a combination of deterministic and statistical learning model components, has been explored. Conceptual and practical aspects of developing hybrid models have been formalized as a methodology for applications to climate modeling and numerical weather prediction. The approach uses NN as a machine or statistical learning technique to develop highly accurate and fast emulations for model physics components/parameterizations. The NN emulations of the most time-consuming model physics components, the long- and short-wave radiation (LWR and SWR) parameterizations, have been combined with the remaining deterministic components of a general circulation model (GCM) to constitute a hybrid GCM (HGCM). The parallel GCM and HGCM simulations produce very similar results, but the HGCM is significantly faster. The high accuracy, which is of paramount importance for the approach, and the speed-up of model calculations when using NN emulations open the opportunity for model improvement. This includes using extended NN ensembles and/or more frequent calculations of full model radiation, resulting in improved radiation-cloud interaction and better consistency with model dynamics and other model physics components. First, the approach was successfully applied to a moderate-resolution (T42L26) uncoupled NCAR Community Atmospheric Model driven by climatological SST for decadal climate simulations. It was then further developed and subsequently implemented into a coupled GCM, the NCEP Climate Forecast System, with significantly higher resolution (T126L64) and time-dependent CO2, and tested for decadal climate simulations, seasonal prediction, and short- to medium-term forecasts.
The highly accurate NN emulations of radiation parameterizations developed here are on average one to two orders of magnitude faster than the original radiation parameterizations. The NN approach was extended by the introduction of NN ensembles and a compound parameterization with quality control of larger errors. The applicability of other statistical learning techniques, such as approximate nearest neighbors and random trees, to the emulation of model physics has also been explored.
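The compound parameterization mentioned above pairs cheap NN emulators with a quality-control fallback to the original scheme. A minimal sketch of that control flow follows; the "emulators" and the "original" scheme here are toy closed-form stand-ins (assumed for illustration, not the dissertation's radiation codes), and ensemble spread is used as an inexpensive proxy for emulation error.

```python
import numpy as np

def original_parameterization(x):
    """Expensive reference scheme (toy stand-in)."""
    return np.sin(x)

# Hypothetical ensemble of cheap emulators, each with a slightly
# different approximation error.
emulators = [
    lambda x: np.sin(x) + 0.01 * np.cos(3 * x),
    lambda x: np.sin(x) - 0.01 * np.cos(2 * x),
    lambda x: np.sin(x) + 0.02 * np.sin(5 * x),
]

def compound_parameterization(x, spread_threshold=0.05):
    """Return the ensemble-mean emulation, falling back to the
    original scheme wherever the members disagree (quality control)."""
    preds = np.stack([f(x) for f in emulators])  # (n_members, n_points)
    mean = preds.mean(axis=0)
    spread = preds.std(axis=0)                   # disagreement = error proxy
    out = np.where(spread > spread_threshold,
                   original_parameterization(x), mean)
    return out, spread

x = np.linspace(-3.0, 3.0, 101)
out, spread = compound_parameterization(x)
```

The design choice is that the expensive scheme is only invoked at the points the quality control flags, so the average cost stays close to that of the emulator alone.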

    Ensemble Learning in the Presence of Noise

    Learning in the presence of noise is an important issue in machine learning. The design and implementation of effective strategies for automatic induction from noisy data is particularly important in real-world problems, where noise from defective collecting processes, data contamination or intrinsic fluctuations is ubiquitous. There are two general strategies to address this problem. One is to design a robust learning method. Another one is to identify noisy instances and eliminate or correct them. In this thesis we propose to use ensembles to mitigate the negative impact of mislabelled data in the learning process. In ensemble learning the predictions of individual learners are combined to obtain a final decision. Effective combinations take advantage of the complementarity of these base learners. In this manner the errors incurred by a learner can be compensated by the predictions of other learners in the combination. A first contribution of this work is the use of subsampling to build bootstrap ensembles, such as bagging and random forest, that are resilient to class label noise. By using lower sampling rates, the detrimental effect of mislabelled examples on the final ensemble decisions can be tempered. The reason is that each labelled instance is present in a smaller fraction of the training sets used to build individual learners. Ensembles can also be used as a noise detection procedure to improve the quality of the data used for training. In this strategy, one attempts to identify noisy instances and either correct (by switching their class label) or discard them. A particular example is identified as noise if a specified percentage (greater than 50%) of the learners disagree with the given label for this example. Using an extensive empirical evaluation we demonstrate the use of subsampling as an effective tool to detect and handle noise in classification problems.
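The two ideas in this abstract, low-rate subsampling and majority-vote noise detection, can be sketched on a toy problem. This is a hypothetical setup (decision stumps instead of full trees, 10% flipped labels in a 1-D classification task): each ensemble member sees only a 20% bootstrap sample, and an example is flagged as noisy when a majority of members disagrees with its given label.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D binary problem: true label is the sign of x; flip 10% of labels.
n = 400
X = rng.uniform(-1.0, 1.0, n)
y_true = (X > 0).astype(int)
noise_mask = rng.random(n) < 0.10
y = np.where(noise_mask, 1 - y_true, y_true)

def fit_stump(Xs, ys):
    """Best single-threshold classifier (a weak base learner)."""
    best = (0.0, 1, -1.0)  # (accuracy, polarity, threshold)
    for t in np.linspace(-1.0, 1.0, 41):
        for pol in (1, 0):
            pred = (Xs > t).astype(int) if pol else (Xs <= t).astype(int)
            acc = (pred == ys).mean()
            if acc > best[0]:
                best = (acc, pol, t)
    _, pol, t = best
    return lambda Z: (Z > t).astype(int) if pol else (Z <= t).astype(int)

# Bagging with a LOW sampling rate: each learner trains on only 20% of the
# data, so any single mislabelled point influences few ensemble members.
members = []
for _ in range(25):
    idx = rng.choice(n, size=int(0.2 * n), replace=True)
    members.append(fit_stump(X[idx], y[idx]))

votes = np.stack([m(X) for m in members])        # (25, n)
agree_with_label = (votes == y).mean(axis=0)

# Noise detection: flag an example if a majority of learners (> 50%)
# disagrees with its given label.
flagged = agree_with_label < 0.5
```

Flagged examples can then either be relabelled (by switching the class label to the ensemble vote) or discarded before retraining, matching the two options described in the abstract.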