62 research outputs found

Coupling Data Science Techniques and Numerical Weather Prediction Models for High-Impact Weather Prediction

Author: Gagne David John II
Publication date: 18/08/2016

Meteorologists have access to more model guidance and observations than ever before, but this additional information does not necessarily lead to better forecasts. New tools are needed to reduce the cognitive load on forecasters and to provide them with accurate, reliable consensus guidance. Techniques from the data science community, such as machine learning and image processing, have the potential to summarize and calibrate numerical weather prediction model output and to generate deterministic and probabilistic forecasts of high-impact weather. In this dissertation, I developed data-science-based approaches to improve the predictions of two high-impact weather domains: hail and solar irradiance. Both hail and solar irradiance produce large economic impacts, have non-Gaussian distributions of occurrence, are poorly observed, and are partially driven by processes too small to be resolved by numerical weather prediction models. Hail forecasts were produced with convection-allowing model output from the Center for Analysis and Prediction of Storms and National Center for Atmospheric Research ensembles. The machine learning hail forecasts were compared against storm surrogate variables and physics-based diagnostic models of hail size. Initial machine learning hail forecasts reduced size errors but struggled with predicting extreme events. By coupling the machine learning model to predicting hail size distributions and estimating the distribution parameters jointly, the machine learning methods were able to show skill and reliability in predicting both severe and significant hail. Machine learning model and data configurations for gridded solar irradiance forecasting were evaluated on two numerical modeling systems. The evaluation determined how machine learning model choice, closeness of fit to training data, training data aggregation, and interpolation method affected forecasts of clearness index at Oklahoma Mesonet sites not included in the training data. 
The choice of machine learning model, interpolation scheme, and loss function had the biggest impacts on performance. Errors tended to be lower at testing sites with sunnier weather and those that were closer to training sites. All of the machine learning methods produced reliable predictions but underestimated the frequency of cloudiness compared to observations.

SHAREOK repository

Generative ensemble deep learning severe weather prediction from a deterministic convection-allowing model

Author: Gagne II David John
Sha Yingkai
Sobash Ryan A.
Publication date: 09/10/2023

An ensemble post-processing method is developed for the probabilistic prediction of severe weather (tornadoes, hail, and wind gusts) over the conterminous United States (CONUS). The method combines conditional generative adversarial networks (CGANs), a type of deep generative model, with a convolutional neural network (CNN) to post-process convection-allowing model (CAM) forecasts. The CGANs are designed to create synthetic ensemble members from deterministic CAM forecasts, and their outputs are processed by the CNN to estimate the probability of severe weather. The method is tested using High-Resolution Rapid Refresh (HRRR) 1–24 hr forecasts as inputs and Storm Prediction Center (SPC) severe weather reports as targets. The method produced skillful predictions with up to 20% Brier Skill Score (BSS) increases compared to other neural-network-based reference methods using a testing dataset of HRRR forecasts in 2021. For the evaluation of uncertainty quantification, the method is overconfident but produces meaningful ensemble spreads that can distinguish good and bad forecasts. The quality of CGAN outputs is also evaluated. Results show that the CGAN outputs behave similarly to a numerical ensemble; they preserved the inter-variable correlations and the contribution of influential predictors as in the original HRRR forecasts. This work provides a novel approach to post-process CAM output using neural networks that can be applied to severe weather prediction.

arXiv.org e-Print Archive
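
For readers unfamiliar with the Brier Skill Score (BSS) quoted in the abstract above, the sketch below shows its usual definition. It is not the authors' evaluation code. The function names are hypothetical, and the default reference forecast (a constant probability equal to the observed base rate) is an assumption made for illustration; the paper itself compares against neural-network-based reference methods.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes, ref_probs=None):
    """BSS = 1 - BS / BS_ref. Positive values mean the forecast beats the reference.

    If no reference forecast is given, a constant forecast equal to the
    observed base rate is used (an illustrative assumption).
    """
    if ref_probs is None:
        base_rate = sum(outcomes) / len(outcomes)
        ref_probs = [base_rate] * len(outcomes)
    bs_ref = brier_score(ref_probs, outcomes)
    if bs_ref == 0:
        raise ValueError("reference Brier score is zero; BSS is undefined")
    return 1.0 - brier_score(probs, outcomes) / bs_ref


# Toy example with made-up numbers (not data from the paper).
outcomes = [1, 0, 0, 1]
forecast = [0.9, 0.1, 0.2, 0.8]
bss = brier_skill_score(forecast, outcomes)
```

Passing a second forecast as `ref_probs` gives the skill of one method relative to another, which is closer to the kind of comparison the abstract describes.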

Mimicking non-ideal instrument behavior for hologram processing using neural style translation

Author: Bansemer Aaron
Gagne David John
Gantos Gabrielle
Hayman Matthew
Schreck John S.
Publication venue: The Optical Society
Publication date: 06/01/2023

Holographic cloud probes provide unprecedented information on cloud particle density, size and position. Each laser shot captures particles within a large volume, where images can be computationally refocused to determine particle size and shape. However, processing these holograms, either with standard methods or with machine learning (ML) models, requires considerable computational resources, time and occasional human intervention. ML models are trained on simulated holograms obtained from the physical model of the probe since real holograms have no absolute truth labels. Using another processing method to produce labels would be subject to errors that the ML model would subsequently inherit. Models perform well on real holograms only when image corruption is performed on the simulated images during training, thereby mimicking non-ideal conditions in the actual probe (Schreck et al., 2022). Optimizing image corruption requires a cumbersome manual labeling effort. Here we demonstrate the application of the neural style translation approach (Gatys et al., 2016) to the simulated holograms. With a pre-trained convolutional neural network (VGG-19), the simulated holograms are "stylized" to resemble the real ones obtained from the probe, while at the same time preserving the simulated image "content" (e.g. the particle locations and sizes). Two image similarity metrics concur that the stylized images are more like real holograms than the synthetic ones. With an ML model trained to predict particle locations and shapes on the stylized data sets, we observed comparable performance on both simulated and real holograms, obviating the need to perform manual labeling. The described approach is not specific to hologram images and could be applied in other domains for capturing noise and imperfections in observational instruments to make simulated data more like real world observations. Comment: 23 pages, 9 figures

arXiv.org e-Print Archive

Physically Explainable Deep Learning for Convective Initiation Nowcasting Using GOES-16 Satellite Observations

Author: Clothiaux Eugene E.
Fan Da
Gagne II David John
Greybush Steven J.
Publication date: 24/10/2023

Convection initiation (CI) nowcasting remains a challenging problem for both numerical weather prediction models and existing nowcasting algorithms. In this study, object-based probabilistic deep learning models are developed to predict CI based on multichannel infrared GOES-R satellite observations. The data come from patches surrounding potential CI events identified in Multi-Radar Multi-Sensor Doppler weather radar products over the Great Plains region from June and July 2020 and June 2021. An objective radar-based approach is used to identify these events. The deep learning models significantly outperform the classical logistic model at lead times up to 1 hour, especially on the false alarm ratio. Through case studies, the deep learning model exhibits dependence on the characteristics of clouds and moisture at multiple levels. Model explanation further reveals the model's decision-making process with different baselines. The explanation results highlight the importance of moisture and cloud features at different levels depending on the choice of baseline. Our study demonstrates the advantage of using different baselines in further understanding model behavior and gaining scientific insights.

arXiv.org e-Print Archive

Machine Learning for Stochastic Parameterization: Generative Adversarial Networks in the Lorenz '96 Model

Author: Christensen Hannah M.
Gagne II David John
Monahan Adam H.
Subramanian Aneesh C.
Publication venue: American Geophysical Union (AGU)
Publication date: 10/09/2019

Stochastic parameterizations account for uncertainty in the representation of unresolved sub-grid processes by sampling from the distribution of possible sub-grid forcings. Some existing stochastic parameterizations utilize data-driven approaches to characterize uncertainty, but these approaches require significant structural assumptions that can limit their scalability. Machine learning models, including neural networks, are able to represent a wide range of distributions and build optimized mappings between a large number of inputs and sub-grid forcings. Recent research on machine learning parameterizations has focused only on deterministic parameterizations. In this study, we develop a stochastic parameterization using the generative adversarial network (GAN) machine learning framework. The GAN stochastic parameterization is trained and evaluated on output from the Lorenz '96 model, which is a common baseline model for evaluating both parameterization and data assimilation techniques. We evaluate different ways of characterizing the input noise for the model and perform model runs with the GAN parameterization at weather and climate timescales. Some of the GAN configurations perform better than a baseline bespoke parameterization at both timescales, and the networks closely reproduce the spatio-temporal correlations and regimes of the Lorenz '96 system. We also find that in general those models which produce skillful forecasts are also associated with the best climate simulations. Comment: Submitted to Journal of Advances in Modeling Earth Systems (JAMES)

arXiv.org e-Print Archive

Oxford University Research Archive
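
As background for the entry above, the sketch below integrates the single-scale Lorenz '96 equations, dX_k/dt = -X_{k-1}(X_{k-2} - X_{k+1}) - X_k + F, with an optional hook where a sub-grid parameterization could be plugged in. It is an illustration under stated assumptions, not the paper's configuration: the function names, the `subgrid` callback, the sign convention of the sub-grid term, the forcing value F = 8, and the use of fourth-order Runge-Kutta are all choices made here.

```python
def l96_tendency(x, forcing=8.0, subgrid=None):
    """Time derivative of the resolved Lorenz '96 variables on a cyclic domain.

    `subgrid` is a hypothetical callback returning one sub-grid tendency per
    variable; it stands in for a learned or hand-built parameterization.
    """
    n = len(x)
    u = subgrid(x) if subgrid is not None else [0.0] * n
    # Negative indices wrap around in Python, which gives the cyclic boundary.
    return [
        -x[k - 1] * (x[k - 2] - x[(k + 1) % n]) - x[k] + forcing - u[k]
        for k in range(n)
    ]


def rk4_step(x, dt, forcing=8.0, subgrid=None):
    """Advance the state by one fourth-order Runge-Kutta step of size dt."""
    def f(state):
        return l96_tendency(state, forcing, subgrid)

    k1 = f(x)
    k2 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + dt * ki for xi, ki in zip(x, k3)])
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


# Start from the steady state X_k = F with a small perturbation and step forward.
state = [8.0] * 8
state[0] += 0.01
for _ in range(100):
    state = rk4_step(state, dt=0.005)
```

Note that a stochastic `subgrid` callback would be sampled four times per step in this scheme; how and when the noise is drawn is a design decision this sketch does not address.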

Neural Network Emulation of the Formation of Organic Aerosols Based on the Explicit GECKO-A Chemistry Model

Author: Becker Charles
Choi Jinkyul
Gagne David John
Hodzic Alma
Lawrence Keely
Mouchel-Vallon Camille
Schreck John S
Wang Siyuan
Publication date: 01/01/2022

Secondary organic aerosols (SOA) are formed from oxidation of hundreds of volatile organic compounds (VOCs) emitted from anthropogenic and natural sources. Accurate predictions of this chemistry are key for air quality and climate studies due to the large contribution of organic aerosols to submicron aerosol mass. Currently, only explicit models, such as the Generator for Explicit Chemistry and Kinetics of Organics in the Atmosphere (GECKO-A), can fully represent the chemical processing of thousands of organic species. However, their extreme computational cost prohibits their use in current chemistry-climate models, which rely on simplified empirical parameterizations to predict SOA concentrations. This study demonstrates that machine learning can accurately emulate SOA formation from an explicit chemistry model with an approximate error of 2%–8%, up to five days for several precursors and for potentially up to one month for recurrent neural network models, and with 100 to 100,000 times speedup over GECKO-A, making it computationally usable in a chemistry-climate model. We generated the training data using thousands of GECKO-A box simulations sampled from a broad range of initial environmental conditions, and focused on three representative SOA precursors: the oxidation by OH of two anthropogenic VOCs (toluene, dodecane), and the oxidation by O3 of one biogenic VOC (α-pinene). We compare several neural models and quantify their underlying uncertainty and robustness. These are promising results, suggesting that neural network models could be applied to predict SOA in chemistry-climate models, limited however to the range of environmental conditions that were considered in the training datasets.

CU Scholar Institutional Repository

Evidential Deep Learning: Enhancing Predictive Uncertainty Estimation for Earth System Science Applications

Author: Becker Charlie
Chapman William E.
Elmore Kim
Gagne II David John
Gantos Gabrielle
Kim Eliot
Kimpara Dhamma
Martin Thomas
Molina Maria J.
Pryzbylo Vanessa M.
Radford Jacob
Saavedra Belen
Schreck John S.
Willson Justin
Wirz Christopher
Publication date: 22/09/2023

Robust quantification of predictive uncertainty is critical for understanding factors that drive weather and climate outcomes. Ensembles provide predictive uncertainty estimates and can be decomposed physically, but both physics and machine learning ensembles are computationally expensive. Parametric deep learning can estimate uncertainty with one model by predicting the parameters of a probability distribution but does not account for epistemic uncertainty. Evidential deep learning, a technique that extends parametric deep learning to higher-order distributions, can account for both aleatoric and epistemic uncertainty with one model. This study compares the uncertainty derived from evidential neural networks to those obtained from ensembles. Through applications of classification of winter precipitation type and regression of surface layer fluxes, we show evidential deep learning models attaining predictive accuracy rivaling standard methods, while robustly quantifying both sources of uncertainty. We evaluate the uncertainty in terms of how well the predictions are calibrated and how well the uncertainty correlates with prediction error. Analyses of uncertainty in the context of the inputs reveal sensitivities to underlying meteorological processes, facilitating interpretation of the models. The conceptual simplicity, interpretability, and computational efficiency of evidential neural networks make them highly extensible, offering a promising approach for reliable and practical uncertainty quantification in Earth system science modeling. In order to encourage broader adoption of evidential deep learning in Earth System Science, we have developed a new Python package, MILES-GUESS (https://github.com/ai2es/miles-guess), that enables users to train and evaluate both evidential and ensemble deep learning.

arXiv.org e-Print Archive
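
To make the idea of a "higher-order distribution" in the abstract above concrete, the sketch below uses one commonly used Dirichlet-based formulation of evidential classification: the network outputs non-negative evidence per class, and class probabilities and an uncertainty mass are derived from it. This is an assumption about the general technique, not a description of the paper's models or of the MILES-GUESS API, and the function name `dirichlet_summary` is hypothetical.

```python
def dirichlet_summary(evidence):
    """Turn per-class evidence into expected probabilities and an uncertainty mass.

    Assumed formulation: alpha_k = evidence_k + 1, S = sum(alpha),
    expected probability p_k = alpha_k / S, uncertainty mass u = K / S.
    With zero evidence the result is a uniform prediction with u = 1.
    """
    if not evidence or any(e < 0 for e in evidence):
        raise ValueError("evidence must be a non-empty list of non-negative values")
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)
    probs = [a / strength for a in alpha]
    uncertainty = num_classes / strength
    return probs, uncertainty


# Four classes (for example, four precipitation types); the evidence values are made up.
no_evidence_probs, no_evidence_u = dirichlet_summary([0.0, 0.0, 0.0, 0.0])
confident_probs, confident_u = dirichlet_summary([9.0, 0.0, 0.0, 1.0])
```

The uncertainty mass shrinks as total evidence grows, which is why it is often read as the epistemic part of the uncertainty, while the spread of the expected probabilities reflects how ambiguous the classes are.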

The importance of sea ice area biases in 21st century multimodel projections of Antarctic temperature and precipitation

Author: Arzel
Bracegirdle
Bracegirdle
Bromwich
Church
Collins
Comiso
David B. Stephenson
Di Luca
Eisenman
Flato
Flato
Frieler
Gagne
Genthon
Gregory
John Turner
Kennicutt
Krinner
Lipzig
Meier
Nicolas
Racherla
Raisanen
Rind
Scinocca
Simmonds
Taylor
Thomas J. Bracegirdle
Tony Phillips
Turner
Weatherly
Publication venue: Wiley
Publication date: 19/12/2015

Climate models exhibit large biases in sea ice area (SIA) in their historical simulations. This study explores the impacts of these biases on multimodel uncertainty in Coupled Model Intercomparison Project phase 5 (CMIP5) ensemble projections of 21st century change in Antarctic surface temperature, net precipitation, and SIA. The analysis is based on time slice climatologies in the Representative Concentration Pathway 8.5 future scenario (2070–2099) and historical (1970–1999) simulations across 37 different CMIP5 models. Projected changes in net precipitation, temperature, and SIA are found to be strongly associated with simulated historical mean SIA (e.g., cross-model correlations of r = 0.77, 0.71, and −0.85, respectively). Furthermore, historical SIA bias is found to have a large impact on the simulated ratio between net precipitation response and temperature response. This ratio is smaller in models with smaller-than-observed SIA. These strong emergent relationships on SIA bias could, if found to be physically robust, be exploited to give more precise climate projections for Antarctica.

Crossref

NERC Open Research Archive