10 research outputs found
An Unsupervised Learning Perspective on the Dynamic Contribution to Extreme Precipitation Changes
Despite the importance of quantifying how the spatial patterns of extreme
precipitation will change with warming, we lack tools to objectively analyze
the storm-scale outputs of modern climate models. To address this gap, we
develop an unsupervised machine learning framework to quantify how storm
dynamics affect precipitation extremes and their changes without sacrificing
spatial information. Over a wide range of precipitation quantiles, we find that
the spatial patterns of extreme precipitation changes are dominated by spatial
shifts in storm regimes rather than intrinsic changes in how these storm
regimes produce precipitation.
Comment: 14 Pages, 9 Figures, Accepted to "Tackling Climate Change with
Machine Learning: workshop at NeurIPS 2022". arXiv admin note: text overlap
with arXiv:2208.1184
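The distinction drawn in the abstract above, between shifts in storm regimes and intrinsic changes in how each regime produces precipitation, can be illustrated with a simple frequency-weighted decomposition. The following is only a sketch under stated assumptions, not the authors' framework: it treats mean precipitation as a sum over regimes of regime frequency times regime-mean precipitation, and the function name `decompose_change` and all numbers are hypothetical.

```python
def decompose_change(freq_ctrl, precip_ctrl, freq_warm, precip_warm):
    """Split the change in frequency-weighted precipitation into three terms.

    Assumption (not from the source): total precipitation is
    sum_k freq[k] * precip[k], where k indexes storm regimes.
    """
    regimes = list(zip(freq_ctrl, precip_ctrl, freq_warm, precip_warm))
    # Change due to regimes becoming more or less frequent.
    shift = sum((fw - fc) * pc for fc, pc, fw, pw in regimes)
    # Change due to each regime raining more or less on its own.
    intrinsic = sum(fc * (pw - pc) for fc, pc, fw, pw in regimes)
    # Second-order term where both change together.
    cross = sum((fw - fc) * (pw - pc) for fc, pc, fw, pw in regimes)
    return shift, intrinsic, cross


# Hypothetical three-regime example (illustrative values only).
freq_ctrl = [0.5, 0.3, 0.2]
precip_ctrl = [1.0, 5.0, 20.0]
freq_warm = [0.4, 0.3, 0.3]
precip_warm = [1.0, 5.5, 21.0]

shift, intrinsic, cross = decompose_change(
    freq_ctrl, precip_ctrl, freq_warm, precip_warm
)
total_change = (
    sum(f * p for f, p in zip(freq_warm, precip_warm))
    - sum(f * p for f, p in zip(freq_ctrl, precip_ctrl))
)
```

By construction the three terms sum to the total change, so comparing `shift` with `intrinsic` indicates which contribution dominates in this toy setting.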
Comparing Storm Resolving Models and Climates via Unsupervised Machine Learning
Storm-resolving models (SRMs) have gained widespread interest because of the
unprecedented detail with which they resolve the global climate. However, it
remains difficult to quantify objective differences in how SRMs resolve complex
atmospheric formations. This lack of appropriate tools for comparing model
similarities is a problem in many disparate fields that involve simulation
tools for complex data. To address this challenge we develop methods to
estimate distributional distances based on both nonlinear dimensionality
reduction and vector quantization. Our approach automatically learns
appropriate notions of similarity from low-dimensional latent data
representations that the different models produce. This enables an
intercomparison of nine SRMs based on their high-dimensional simulation data
and reveals that only six are similar in their representation of atmospheric
dynamics. Furthermore, we uncover signatures of the convective response to
global warming in a fully unsupervised way. Our study provides a path toward
evaluating future high-resolution simulation data more objectively.
Comment: 22 pages, 19 figures. Submitted to journal for consideration
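As a rough illustration of a distributional distance built on vector quantization, the sketch below assigns low-dimensional latent vectors to their nearest codebook entry, builds a histogram of code usage per model, and compares two histograms. It is not the authors' implementation: the fixed codebook, the choice of Jensen-Shannon divergence, the function names, and the sample latent vectors are all assumptions made for this example.

```python
import math


def nearest_code(vec, codebook):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((v - c) ** 2 for v, c in zip(vec, codebook[k])),
    )


def code_histogram(latents, codebook):
    """Fraction of latent vectors assigned to each codebook entry."""
    counts = [0] * len(codebook)
    for vec in latents:
        counts[nearest_code(vec, codebook)] += 1
    return [c / len(latents) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical, 1 for disjoint histograms."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical 2-D latent vectors from two models and a shared two-entry codebook.
codebook = [(0.0, 0.0), (1.0, 1.0)]
latents_a = [(0.1, 0.0), (0.9, 1.1), (0.0, 0.2), (1.0, 0.8)]
latents_b = [(0.1, 0.1), (0.0, -0.1), (0.2, 0.0), (1.1, 1.0)]

hist_a = code_histogram(latents_a, codebook)
hist_b = code_histogram(latents_b, codebook)
distance_ab = js_divergence(hist_a, hist_b)
```

In practice the codebook would be learned from the data (for example by clustering the latent space) rather than fixed by hand, and `code_histogram` assumes a non-empty list of latent vectors.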
ClimSim: A large multi-scale dataset for hybrid physics-ML climate emulation
Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher-fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short, high-resolution simulations to ML emulators. However, this hybrid ML-physics simulation approach requires domain-specific treatment and has been inaccessible to ML experts because of a lack of training data and relevant, easy-to-use workflows. We present ClimSim, the largest-ever dataset designed for hybrid ML-physics research. It comprises multi-scale climate simulations, developed by a consortium of climate scientists and ML researchers. It consists of 5.7 billion pairs of multivariate input and output vectors that isolate the influence of locally-nested, high-resolution, high-fidelity physics on a host climate simulator's macro-scale physical state. The dataset is global in coverage, spans multiple years at high sampling frequency, and is designed such that resulting emulators are compatible with downstream coupling into operational climate simulators. We implement a range of deterministic and stochastic regression baselines to highlight the ML challenges and their scoring. The data (https://huggingface.co/datasets/LEAP/ClimSim_high-res) and code (https://leap-stc.github.io/ClimSim) are released openly to support the development of hybrid ML-physics and high-fidelity climate simulations for the benefit of science and society.
Recommended from our members
Improving The Modeling and Analysis of Tropical Convection and Precipitation through Machine Learning Methods
Our knowledge of the atmosphere has increased immensely in the last few decades because of high-resolution "storm-resolving" climate models. With these models, we can simulate atmospheric processes, including deep, moist convection, in previously unattainable detail, giving us a more accurate representation of storms, precipitation, and atmospheric waves. However, limits continue to constrain our understanding of the dynamics of the atmosphere. We presently lack the ability to run these new storm-resolving models (SRMs) for the durations needed to understand the cloud-climate feedback. Meanwhile, running these SRMs for any amount of time produces very large volumes of data that are difficult to analyze properly. This work leverages disparate machine-learning approaches in an attempt to break through these deadlocks. First, we implement feed-forward neural networks to replace the computationally expensive explicit convection calculations within the "Super-parameterized Community Atmospheric Model" (SPCAM), allowing us to run the model at a fraction of the original computational cost but with the same accuracy, even when realistic geographic boundary conditions are included. Second, we use deep generative models to analyze and organize SPCAM output. This allows us to identify unique types of convection as well as convective storm anomalies within the data. Third, we expand on this unsupervised learning work to compare different SRMs, including uniform-resolution global cloud-resolving models, and quantify which have similar representations of the dynamics of the atmosphere. We find that even among high-resolution SRMs there are substantial differences in the type, proportion, and intensity of convection in representations of atmospheric dynamics. Fourth, we leverage these deep generative machine learning models to construct a novel metric of climate change and use it to better understand the physical mechanisms driving changes in extreme precipitation.
We capture anticipated signals of global warming with minimal human intervention while showing the importance of the convection regime type in controlling the changing spatial patterns of heavy rainfall.
Comparing storm resolving models and climates via unsupervised machine learning
Global storm-resolving models (GSRMs) have gained widespread interest because of the unprecedented detail with which they resolve the global climate. However, it remains difficult to quantify objective differences in how GSRMs resolve complex atmospheric formations. This lack of comprehensive tools for comparing model similarities is a problem in many disparate fields that involve simulation tools for complex data. To address this challenge we develop methods to estimate distributional distances based on both nonlinear dimensionality reduction and vector quantization. Our approach automatically learns physically meaningful notions of similarity from low-dimensional latent data representations that the different models produce. This enables an intercomparison of nine GSRMs based on their high-dimensional simulation data (2D vertical velocity snapshots) and reveals that only six are similar in their representation of atmospheric dynamics. Furthermore, we uncover signatures of the convective response to global warming in a fully unsupervised way. Our study provides a path toward evaluating future high-resolution simulation data more objectively.
ClimSim: An open large-scale dataset for training high-resolution physics emulators in hybrid multi-scale climate simulators
Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher-fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short, high-resolution simulations to ML emulators. However, this hybrid ML-physics simulation approach requires domain-specific treatment and has been inaccessible to ML experts because of a lack of training data and relevant, easy-to-use workflows. We present ClimSim, the largest-ever dataset designed for hybrid ML-physics research. It comprises multi-scale climate simulations, developed by a consortium of climate scientists and ML researchers. It consists of 5.7 billion pairs of multivariate input and output vectors that isolate the influence of locally-nested, high-resolution, high-fidelity physics on a host climate simulator's macro-scale physical state. The dataset is global in coverage, spans multiple years at high sampling frequency, and is designed such that resulting emulators are compatible with downstream coupling into operational climate simulators. We implement a range of deterministic and stochastic regression baselines to highlight the ML challenges and their scoring. The data (https://huggingface.co/datasets/LEAP/ClimSim_high-res, https://huggingface.co/datasets/LEAP/ClimSim_low-res, and https://huggingface.co/datasets/LEAP/ClimSim_low-res_aqua-planet) and code (https://leap-stc.github.io/ClimSim) are released openly to support the development of hybrid ML-physics and high-fidelity climate simulations for the benefit of science and society.