18 research outputs found
Deep learning to represent sub-grid processes in climate models
The representation of nonlinear sub-grid processes, especially clouds, has
been a major source of uncertainty in climate models for decades.
Cloud-resolving models better represent many of these processes and can now be
run globally but only for short-term simulations of at most a few years because
of computational limitations. Here we demonstrate that deep learning can be
used to capture many advantages of cloud-resolving modeling at a fraction of
the computational cost. We train a deep neural network to represent all
atmospheric sub-grid processes in a climate model by learning from a
multi-scale model in which convection is treated explicitly. The trained neural
network then replaces the traditional sub-grid parameterizations in a global
general circulation model in which it freely interacts with the resolved
dynamics and the surface-flux scheme. The prognostic multi-year simulations are
stable and closely reproduce not only the mean climate of the cloud-resolving
simulation but also key aspects of variability, including precipitation
extremes and the equatorial wave spectrum. Furthermore, the neural network
approximately conserves energy despite not being explicitly instructed to.
Finally, we show that the neural network parameterization generalizes to new
surface forcing patterns but struggles to cope with temperatures far outside
its training manifold. Our results show the feasibility of using deep learning
for climate model parameterization. In a broader context, we anticipate that
data-driven Earth System Model development could play a key role in reducing
climate prediction uncertainty in the coming decade.
Comment: View official PNAS version at https://doi.org/10.1073/pnas.181028611
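The hybrid setup this abstract describes, a neural network learning sub-grid tendencies from a convection-resolving model and then being called inside the host model's time loop, can be illustrated with a toy sketch. Everything here is an illustrative assumption: a tiny one-hidden-layer network and a synthetic `true_tendency` standing in for the multi-scale model output; the actual study trains a deep network on full atmospheric fields.

```python
# Toy sketch: fit a tiny neural network to a synthetic "sub-grid tendency"
# of the resolved state. Pure Python, illustrative only; true_tendency()
# stands in for output of a convection-resolving model.
import math
import random

random.seed(0)

def true_tendency(x):
    # Hypothetical unresolved process the network must learn.
    return 0.5 * math.sin(2.0 * x)

H = 16  # hidden units
w1 = [random.uniform(-1, 1) for _ in range(H)]
b1 = [0.0] * H
w2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0

def forward(x):
    h = [math.tanh(w1[i] * x + b1[i]) for i in range(H)]
    return sum(w2[i] * h[i] for i in range(H)) + b2, h

def train_epoch(xs, lr=0.02):
    # Per-sample gradient descent; returns mean squared-error over the epoch.
    global b2
    loss = 0.0
    for x in xs:
        y, h = forward(x)
        e = y - true_tendency(x)
        loss += 0.5 * e * e
        for i in range(H):
            dh = e * w2[i] * (1.0 - h[i] * h[i])  # backprop through tanh
            w2[i] -= lr * e * h[i]
            w1[i] -= lr * dh * x
            b1[i] -= lr * dh
        b2 -= lr * e
    return loss / len(xs)

xs = [random.uniform(-2, 2) for _ in range(200)]
loss0 = train_epoch(xs)
for _ in range(300):
    loss = train_epoch(xs)
```

In the paper's setting the trained network replaces the entire conventional sub-grid scheme, so the analogue of `forward(x)` is evaluated every model time step in place of the parameterization.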
Improving Precipitation Estimation Using Convolutional Neural Network
The precipitation process is generally considered to be poorly represented in numerical weather/climate models. Statistical downscaling (SD) methods, which relate precipitation to the model-resolved dynamics, often provide more accurate precipitation estimates than the model's raw precipitation products. We introduce a convolutional neural network model to advance this aspect of SD for daily precipitation prediction. Specifically, we restrict the predictors to the variables that are directly resolved by discretizing the atmospheric dynamics equations. In this sense, our model works as an alternative to the existing precipitation-related parameterization schemes for numerical precipitation estimation. We train the model to learn precipitation-related dynamical features from the surrounding dynamical fields by optimizing a hierarchical set of spatial convolution kernels. We test the model at 14 geogrid points across the contiguous United States. Results show that, provided with enough data, precipitation estimates from the convolutional neural network model outperform the reanalysis precipitation products, as well as SD products using linear regression, nearest neighbor, random forest, or fully connected deep neural networks. Evaluation on the test set suggests that the improvements can be seamlessly transferred to numerical weather modeling for improving precipitation prediction. Based on the default network, we examine the impact of the network architecture on model performance. We also offer simple visualization and analysis approaches to interpret the models and their results. Our study contributes in two ways: first, we offer a novel approach to enhance numerical precipitation estimation; second, the proposed model provides important implications for improving precipitation-related parameterization schemes using a data-driven approach.
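The "hierarchical set of spatial convolution kernels" above is built by stacking the basic 2D convolution operation over gridded predictor fields. A minimal "valid" convolution makes the operation concrete; the patch and kernel values below are invented for illustration, not taken from the study.

```python
# Minimal 'valid' 2D convolution: the elementary operation a CNN stacks
# to extract spatial features from surrounding dynamical fields.
def conv2d_valid(field, kernel):
    fh, fw = len(field), len(field[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(fh - kh + 1):
        row = []
        for j in range(fw - kw + 1):
            s = sum(field[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A 4x4 predictor patch and a 3x3 averaging kernel (illustrative values).
patch = [[1, 2, 3, 4],
         [2, 3, 4, 5],
         [3, 4, 5, 6],
         [4, 5, 6, 7]]
mean3 = [[1 / 9.0] * 3 for _ in range(3)]
features = conv2d_valid(patch, mean3)  # a 2x2 feature map
```

In a trained network the kernel weights are learned rather than fixed, and many such kernels are applied in sequence with nonlinearities between layers.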
A Container-Based Workflow for Distributed Training of Deep Learning Algorithms in HPC Clusters
Deep learning has been postulated as a solution for numerous problems in
different branches of science. Given the resource-intensive nature of these
models, they often need to be executed on specialized hardware such as
graphics processing units (GPUs) in a distributed manner. In the academic
field, researchers get access to such resources through High Performance
Computing (HPC) clusters. These infrastructures make the training of such
models difficult because of their multi-user nature and limited user
permissions. In addition, different HPC clusters may have different
peculiarities that can complicate the research cycle (e.g., library
dependencies). In this paper we develop a workflow and methodology for the
distributed training of deep learning models in HPC clusters that provides
researchers with a series of novel advantages. It relies on udocker as the
containerization tool and on Horovod as the library for distributing the
models across multiple GPUs. udocker does not need any special permissions,
allowing researchers to run the entire workflow without relying on an
administrator. Horovod ensures the efficient distribution of the training
independently of the deep learning framework used. Additionally, due to
containerization and specific features of the workflow, it provides researchers
with a cluster-agnostic way of running their models. The experiments carried
out show that the workflow offers good scalability in the distributed training
of the models and that it easily adapts to different clusters.
Comment: Under review for Cluster Computin
Machine Learning for Stochastic Parameterization: Generative Adversarial Networks in the Lorenz '96 Model
Stochastic parameterizations account for uncertainty in the representation of
unresolved sub-grid processes by sampling from the distribution of possible
sub-grid forcings. Some existing stochastic parameterizations utilize
data-driven approaches to characterize uncertainty, but these approaches
require significant structural assumptions that can limit their scalability.
Machine learning models, including neural networks, are able to represent a
wide range of distributions and build optimized mappings between a large number
of inputs and sub-grid forcings. Recent research on machine learning
parameterizations has focused only on deterministic parameterizations. In this
study, we develop a stochastic parameterization using the generative
adversarial network (GAN) machine learning framework. The GAN stochastic
parameterization is trained and evaluated on output from the Lorenz '96 model,
which is a common baseline model for evaluating both parameterization and data
assimilation techniques. We evaluate different ways of characterizing the input
noise for the model and perform model runs with the GAN parameterization at
weather and climate timescales. Some of the GAN configurations perform better
than a baseline bespoke parameterization at both timescales, and the networks
closely reproduce the spatio-temporal correlations and regimes of the Lorenz
'96 system. We also find that in general those models which produce skillful
forecasts are also associated with the best climate simulations.
Comment: Submitted to Journal of Advances in Modeling Earth Systems (JAMES
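The Lorenz '96 testbed used above is a small cyclic system with the standard tendency dX_i/dt = (X_{i+1} − X_{i−2})X_{i−1} − X_i + F; the conventional forcing choice F = 8 produces chaotic dynamics. A minimal integrator (the Euler step and step size are illustrative; the study's stochastic GAN component is not sketched here):

```python
# Single-level Lorenz '96 model with cyclic boundary conditions:
#   dX_i/dt = (X_{i+1} - X_{i-2}) * X_{i-1} - X_i + F
# A standard testbed for parameterization and data assimilation studies.
def l96_tendency(x, F=8.0):
    n = len(x)
    # Python's negative indexing handles the cyclic wrap-around for i-1, i-2.
    return [(x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + F
            for i in range(n)]

def euler_step(x, dt=0.01, F=8.0):
    dx = l96_tendency(x, F)
    return [x[i] + dt * dx[i] for i in range(len(x))]

# The uniform state x_i = F is a fixed point: the tendency vanishes there
# (the advection term cancels and -x_i + F = 0).
state = [8.0] * 40
```

In a parameterization study, a second fast-variable level is coupled to each X_i, and the learned (here, GAN-sampled) scheme replaces that unresolved coupling term.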
Assessing the Potential of Deep Learning for Emulating Cloud Superparameterization in Climate Models with Real-Geography Boundary Conditions
We explore the potential of feed-forward deep neural networks (DNNs) for
emulating cloud superparameterization in realistic geography, using offline
fits to data from the Super Parameterized Community Atmospheric Model. To
identify the network architecture of greatest skill, we formally optimize
hyperparameters using ~250 trials. Our DNN explains over 70 percent of the
temporal variance at the 15-minute sampling scale throughout the mid-to-upper
troposphere. Autocorrelation timescale analysis compared against DNN skill
suggests that the poorer fit in the tropical marine boundary layer is driven by
the neural network's difficulty emulating fast, stochastic signals in convection.
However, spectral analysis in the temporal domain indicates skillful emulation
of signals on diurnal to synoptic scales. A close look at the diurnal cycle
reveals correct emulation of land-sea contrasts and vertical structure in the
heating and moistening fields, but some distortion of precipitation.
Sensitivity tests targeting precipitation skill reveal complementary effects of
adding positive constraints vs. hyperparameter tuning, motivating the use of
both in the future. A first attempt to force an offline land model with DNN
emulated atmospheric fields produces reassuring results further supporting
neural network emulation viability in real-geography settings. Overall, the fit
skill is competitive with recent attempts by sophisticated Residual and
Convolutional Neural Network architectures trained on added information,
including memory of past states. Our results confirm the parameterizability of
superparameterized convection with continents through machine learning and we
highlight advantages of casting this problem locally in space and time for
accurate emulation and hopefully quick implementation of hybrid climate models.
Comment: 32 Pages, 13 Figures, Revised Version Submitted to Journal of Advances in Modeling Earth Systems April 202
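The "explains over 70 percent of the temporal variance" skill quoted above is the usual coefficient of determination, R² = 1 − Σ(y − ŷ)² / Σ(y − ȳ)². A minimal version, with made-up truth/prediction values rather than the study's data:

```python
# Coefficient of determination R^2: the fraction of variance in the truth
# that an emulator's predictions explain. Illustrative values only.
def r2_score(truth, pred):
    mean = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, pred))
    ss_tot = sum((t - mean) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot

truth = [1.0, 2.0, 3.0, 4.0]
perfect = r2_score(truth, truth)          # exact emulation: R^2 = 1
climatology = r2_score(truth, [2.5] * 4)  # predicting the mean: R^2 = 0
```

An emulator "explaining 70 percent of variance" at a grid cell corresponds to R² = 0.7 for that cell's time series; values below zero would mean it does worse than predicting the climatological mean.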