A test case for application of convolutional neural networks to spatio-temporal climate data: Re-identifying clustered weather patterns
Convolutional neural networks (CNNs) can potentially provide powerful tools
for classifying and identifying patterns in climate and environmental data.
However, because of the inherent complexities of such data, which are often
spatio-temporal, chaotic, and non-stationary, the CNN algorithms must be
designed/evaluated for each specific dataset and application. Moreover, CNNs,
as supervised techniques, require large labeled datasets to begin with.
Labeling demands (human) expert time, which, combined with the limited number
of relevant examples in this area, can discourage the use of CNNs for new
problems. To address these challenges, here we (1) propose an effective
auto-labeling strategy based
on using an unsupervised clustering algorithm and evaluating the performance of
CNNs in re-identifying these clusters; (2) Use this approach to label thousands
of daily large-scale weather patterns over North America in the outputs of a
fully-coupled climate model and show the capabilities of CNNs in re-identifying
the four clustered regimes. A deep CNN trained with a sufficiently large
number of samples per cluster re-identifies the regimes with high accuracy,
and accuracy scales monotonically but nonlinearly with the size of the
training set. Effects of architecture and hyperparameters on the performance
of CNNs are examined and discussed.
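The auto-labeling strategy lends itself to a compact illustration. Below is a minimal sketch, assuming synthetic stand-in fields and a toy architecture (the grid size, cluster count, and network are illustrative assumptions, not the paper's configuration): cluster unlabeled daily maps with k-means, then train a small CNN to re-identify the cluster indices.

```python
# Auto-labeling sketch: k-means provides labels; a CNN learns to re-identify them.
# Shapes, cluster count, and architecture are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

n_days, ny, nx, k = 2000, 32, 64, 4           # hypothetical daily fields, 4 regimes
fields = np.random.randn(n_days, ny, nx).astype(np.float32)  # stand-in weather maps

# 1) Unsupervised auto-labeling: cluster the flattened anomaly maps.
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(
    fields.reshape(n_days, -1))

# 2) Supervised re-identification: train a small CNN on (field, cluster) pairs.
cnn = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(16 * (ny // 4) * (nx // 4), k))
opt = torch.optim.Adam(cnn.parameters(), lr=1e-3)
x = torch.from_numpy(fields).unsqueeze(1)     # (N, 1, ny, nx)
y = torch.from_numpy(labels).long()
for epoch in range(5):                        # short demo training loop
    opt.zero_grad()
    loss = nn.functional.cross_entropy(cnn(x), y)
    loss.backward(); opt.step()
with torch.no_grad():
    acc = (cnn(x).argmax(1) == y).float().mean()  # re-identification accuracy
print(f"train accuracy: {acc:.2f}")
```

In practice, held-out accuracy on test samples, not training accuracy, would quantify re-identification skill.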
Data-driven super-parameterization using deep learning: Experimentation with multi-scale Lorenz 96 systems and transfer-learning
To make weather/climate modeling computationally affordable, small-scale
processes are usually represented in terms of the large-scale,
explicitly-resolved processes using physics-based or semi-empirical
parameterization schemes. Another approach, computationally more demanding but
often more accurate, is super-parameterization (SP), which involves integrating
the equations of small-scale processes on high-resolution grids embedded within
the low-resolution grids of large-scale processes. Recently, studies have used
machine learning (ML) to develop data-driven parameterization (DD-P) schemes.
Here, we propose a new approach, data-driven SP (DD-SP), in which the equations
of the small-scale processes are integrated in a data-driven fashion using ML
methods such
as recurrent neural networks. Employing multi-scale Lorenz 96 systems as
testbed, we compare the cost and accuracy (in terms of both short-term
prediction and long-term statistics) of parameterized low-resolution (LR), SP,
DD-P, and DD-SP models. We show that with the same computational cost, DD-SP
substantially outperforms LR, and is better than DD-P, particularly when scale
separation is lacking. DD-SP is much cheaper than SP, yet its accuracy is the
same in reproducing long-term statistics and often comparable in short-term
forecasting. We also investigate generalization, finding that when models
trained on data from one system are applied to a system with different forcing
(e.g., more chaotic), the models often do not generalize, particularly when the
short-term prediction accuracy is examined. But we show that transfer-learning,
which involves re-training the data-driven model with a small amount of data
from the new system, significantly improves generalization. Potential
applications of DD-SP and transfer-learning in climate/weather modeling and the
expected challenges are discussed.
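To make the DD-SP idea concrete, here is a minimal sketch under stated assumptions: a two-tier Lorenz 96 system in which the resolved X equation is stepped numerically while the fast Y variables are advanced by a stand-in emulator (the paper uses trained RNNs; the `fast_emulator` below is a hypothetical placeholder with the same interface). Transfer-learning would amount to briefly re-training this emulator on a small amount of data from the new system.

```python
# DD-SP time-stepping sketch: the resolved X equation is integrated numerically,
# while the unresolved Y dynamics are advanced by a data-driven emulator whose
# output feeds back into X through the coupling term.
import numpy as np

K, Jper, dt, F, h, c, b = 8, 32, 0.005, 10.0, 1.0, 10.0, 10.0  # L96 parameters

def dXdt(X, coupling):
    # Standard Lorenz 96 slow-variable tendency plus subgrid coupling.
    return (np.roll(X, -1) - np.roll(X, 2)) * np.roll(X, 1) - X + F + coupling

def fast_emulator(Y, X):
    # Hypothetical stand-in for a learned model of the fast variables;
    # in DD-SP this would be an RNN trained on high-resolution Y data.
    return Y + dt * (-c * b * Y + (h * c / b) * np.repeat(X, Jper))

rng = np.random.default_rng(0)
X = F * rng.random(K)
Y = 0.1 * rng.random(K * Jper)
for step in range(1000):
    Y = fast_emulator(Y, X)                                   # data-driven update
    coupling = -(h * c / b) * Y.reshape(K, Jper).sum(axis=1)  # feedback into X
    X = X + dt * dXdt(X, coupling)                            # explicit Euler for clarity
```

The cost advantage over SP comes from replacing the expensive high-resolution integration of Y with a single emulator evaluation per step.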
Data-driven prediction of a multi-scale Lorenz 96 chaotic system using deep learning methods: Reservoir computing, ANN, and RNN-LSTM
In this paper, the performance of three deep learning methods for predicting
short-term evolution and for reproducing the long-term statistics of a
multi-scale spatio-temporal Lorenz 96 system is examined. The methods are: echo
state network (a type of reservoir computing, RC-ESN), deep feed-forward
artificial neural network (ANN), and recurrent neural network with long
short-term memory (RNN-LSTM). This Lorenz 96 system has three tiers of
nonlinearly interacting variables representing slow/large-scale (X),
intermediate (Y), and fast/small-scale (Z) processes. For training or
testing, only X is available; Y and Z are never known or used. We show
that RC-ESN substantially outperforms ANN and RNN-LSTM for short-term
prediction, e.g., accurately forecasting the chaotic trajectories for hundreds
of numerical solver's time steps, equivalent to several Lyapunov timescales.
The RNN-LSTM and ANN show some prediction skill as well, with RNN-LSTM
outperforming ANN.
Furthermore, even after losing the trajectory, data predicted by RC-ESN and
RNN-LSTM have probability density functions (PDFs) that closely match the true
PDF, even at the tails. The PDF of the data predicted using ANN, however,
deviates from the true PDF. Implications, caveats, and applications to
data-driven and data-assisted surrogate modeling of complex nonlinear dynamical
systems such as weather/climate are discussed.
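For context on what a reservoir-computing ESN involves, here is a minimal numpy sketch (the reservoir size, spectral radius, and ridge penalty are illustrative, and the random series stands in for the Lorenz 96 X data): a fixed random reservoir is driven by the input, and only a linear readout is trained, by ridge regression, to map reservoir states to the next input.

```python
# Minimal echo state network (RC-ESN) sketch: fixed random reservoir plus a
# ridge-regression readout; only W_out is trained.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res, rho, ridge = 8, 500, 0.9, 1e-6

W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= rho / max(abs(np.linalg.eigvals(W)))       # rescale to spectral radius rho

def run_reservoir(U):
    # U: (T, n_in) input sequence; returns reservoir states (T, n_res).
    r, states = np.zeros(n_res), []
    for u in U:
        r = np.tanh(W @ r + W_in @ u)
        states.append(r.copy())
    return np.array(states)

# Train the readout to map state r_t -> next input u_{t+1} (teacher forcing).
U = rng.standard_normal((2000, n_in))           # stand-in for Lorenz 96 X data
R = run_reservoir(U[:-1])
W_out = np.linalg.solve(R.T @ R + ridge * np.eye(n_res), R.T @ U[1:]).T

# Predict autonomously by feeding outputs back as inputs (free-running forecast).
r, u, preds = R[-1].copy(), U[-1], []
for _ in range(100):
    r = np.tanh(W @ r + W_in @ u)
    u = W_out @ r
    preds.append(u)
```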
Interpretable structural model error discovery from sparse assimilation increments using spectral bias-reduced neural networks: A quasi-geostrophic turbulence test case
Earth system models suffer from various structural and parametric errors in
their representation of nonlinear, multi-scale processes, leading to
uncertainties in their long-term projections. The effects of many of these
errors (particularly those due to fast physics) can be quantified in short-term
simulations, e.g., as differences between the predicted and observed states
(analysis increments). With the increase in the availability of high-quality
observations and simulations, learning nudging-like corrections to model
errors from these increments has become an active research area. However,
most studies focus on using neural networks, which, while powerful, are hard
to interpret, are data-hungry, and generalize poorly out-of-distribution.
Here, we show the
capabilities of Model Error Discovery with Interpretability and Data
Assimilation (MEDIDA), a general, data-efficient framework that uses
sparsity-promoting equation-discovery techniques to learn model errors from
analysis increments. Using two-layer quasi-geostrophic turbulence as the test
case, MEDIDA is shown to successfully discover various linear and nonlinear
structural/parametric errors when full observations are available. Discovery
from spatially sparse observations is found to require highly accurate
interpolation schemes. While NNs have shown success as interpolators in recent
studies, here, they are found inadequate due to their inability to accurately
represent small scales, a phenomenon known as spectral bias. We show that a
general remedy, adding a random Fourier feature layer to the NN, resolves this
issue enabling MEDIDA to successfully discover model errors from sparse
observations. These promising results suggest that with further development,
MEDIDA could be scaled up to models of the Earth system and real observations.
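The equation-discovery step can be illustrated with a toy sparse regression. The sketch below assumes a SINDy-style sequential thresholded least squares over a hypothetical scalar library; the paper's actual library contains quasi-geostrophic terms, and, for sparse observations, its interpolation NN would additionally prepend a random Fourier feature layer to counter spectral bias.

```python
# Sparsity-promoting discovery sketch: regress analysis increments onto a
# library of candidate terms and prune small coefficients (sequential
# thresholded least squares). The library is a toy stand-in.
import numpy as np

def stlsq(Theta, dI, threshold=0.1, iters=10):
    # Theta: (n_samples, n_terms) candidate-term library
    # dI:    (n_samples,) analysis increments (model-error proxy)
    xi = np.linalg.lstsq(Theta, dI, rcond=None)[0]
    for _ in range(iters):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        active = ~small
        if active.any():
            xi[active] = np.linalg.lstsq(Theta[:, active], dI, rcond=None)[0]
    return xi  # sparse coefficients: the "discovered" error terms

# Toy demo: true structural error = 0.5*q - 2.0*q**3, seen through increments.
rng = np.random.default_rng(1)
q = rng.standard_normal(500)
library = np.stack([q, q**2, q**3, np.sin(q)], axis=1)
increments = 0.5 * q - 2.0 * q**3 + 0.01 * rng.standard_normal(500)
print(stlsq(library, increments))   # expect approximately [0.5, 0, -2.0, 0]
```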
Learning physics-constrained subgrid-scale closures in the small-data regime for stable and accurate LES
We demonstrate how incorporating physics constraints into convolutional
neural networks (CNNs) enables learning subgrid-scale (SGS) closures for stable
and accurate large-eddy simulations (LES) in the small-data regime (i.e., when
the availability of high-quality training data is limited). Using several
setups of forced 2D turbulence as the testbeds, we examine the a priori
and a posteriori performance of three methods for incorporating physics:
1) data augmentation (DA), 2) CNN with group convolutions (GCNN), and 3) loss
functions that enforce a global enstrophy-transfer conservation (EnsCon). While
the data-driven closures from physics-agnostic CNNs trained in the big-data
regime are accurate and stable, and outperform dynamic Smagorinsky (DSMAG)
closures, their performance substantially deteriorates when these CNNs are
trained with 40x fewer samples (the small-data regime). We show that CNN with
DA and GCNN address this issue and each produce accurate and stable data-driven
closures in the small-data regime. Despite its simplicity, DA, which adds
appropriately rotated samples to the training set, performs as well or in some
cases even better than GCNN, which uses a sophisticated equivariance-preserving
architecture. EnsCon, which combines structural modeling with aspects of
functional modeling, also produces accurate and stable closures in the
small-data regime. Overall, GCNN+EnsCon, which combines these two physics
constraints, shows the best a posteriori performance in this regime.
These results illustrate the power of physics-constrained learning in the
small-data regime for accurate and stable LES.
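Of the three physics constraints, DA is the simplest to illustrate. A minimal sketch, assuming scalar (vorticity-based) fields on a square grid: since forced 2D turbulence is statistically invariant under 90-degree rotations, each (filtered vorticity, SGS forcing) pair can be rotated to quadruple a small training set.

```python
# Data-augmentation (DA) sketch for 2D turbulence: rotate each scalar training
# sample to exploit rotational symmetry. Fields here are random stand-ins.
import numpy as np

def augment_rotations(inputs, targets):
    # inputs, targets: (N, ny, nx) scalar fields (e.g., vorticity, SGS forcing).
    # Rotating scalar fields needs no component transformation, unlike vectors.
    xs, ys = [], []
    for k in range(4):                       # 0, 90, 180, 270 degrees
        xs.append(np.rot90(inputs, k, axes=(1, 2)))
        ys.append(np.rot90(targets, k, axes=(1, 2)))
    return np.concatenate(xs), np.concatenate(ys)

rng = np.random.default_rng(0)
w = rng.standard_normal((100, 64, 64))       # filtered vorticity samples
pi = rng.standard_normal((100, 64, 64))      # corresponding SGS forcing
w_aug, pi_aug = augment_rotations(w, pi)     # 4x more samples for the CNN
print(w_aug.shape)                           # (400, 64, 64)
```

GCNN would instead build this symmetry into the network via group convolutions, and EnsCon would add a loss term enforcing global enstrophy-transfer conservation.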
OceanNet: a principled neural operator-based digital twin for regional oceans.
While data-driven approaches demonstrate great potential in atmospheric modeling and weather forecasting, ocean modeling poses distinct challenges due to complex bathymetry, land, vertical structure, and flow non-linearity. This study introduces OceanNet, a principled neural operator-based digital twin for regional sea-surface height emulation. OceanNet uses a Fourier neural operator and a predictor-evaluate-corrector integration scheme to mitigate autoregressive error growth and enhance stability over extended time scales. A spectral regularizer counteracts spectral bias at smaller scales. OceanNet is applied to the northwest Atlantic Ocean western boundary current (the Gulf Stream), focusing on the task of seasonal prediction for Loop Current eddies and the Gulf Stream meander. Trained using historical sea surface height (SSH) data, OceanNet demonstrates competitive forecast skill compared to a state-of-the-art dynamical ocean model forecast, reducing computation by 500,000 times. These accomplishments demonstrate initial steps for physics-inspired deep neural operators as cost-effective alternatives to high-resolution numerical ocean models.
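A spectral regularizer of the kind described can be sketched as an extra loss term; the cutoff wavenumber and weight below are illustrative assumptions, not OceanNet's actual values.

```python
# Spectral-regularizer sketch: alongside the grid-space loss, penalize the
# mismatch between predicted and true fields in Fourier space, emphasizing
# the small scales where spectral bias appears.
import torch

def spectral_loss(pred, true, cutoff=16):
    # pred, true: (batch, ny, nx) SSH fields; cutoff selects high wavenumbers.
    fp = torch.fft.rfft2(pred)
    ft = torch.fft.rfft2(true)
    err = (fp - ft).abs() ** 2
    return err[..., cutoff:].mean()          # high-wavenumber error only

pred = torch.randn(4, 64, 64, requires_grad=True)
true = torch.randn(4, 64, 64)
loss = torch.nn.functional.mse_loss(pred, true) + 0.1 * spectral_loss(pred, true)
loss.backward()
```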