
    Visual Analysis of Variability and Features of Climate Simulation Ensembles

    This PhD thesis is concerned with the visual analysis of time-dependent scalar field ensembles as they occur in climate simulations. Modern climate projections consist of multiple simulation runs (ensemble members) that vary in parameter settings and/or initial values, which leads to variations in the resulting simulation data. The goal of ensemble simulations is to sample the space of possible futures under the given climate model and to provide quantitative information about uncertainty in the results. The analysis of such data is challenging because, in addition to the spatiotemporal data, the variability across members has to be analyzed and communicated. This thesis presents novel techniques for the visual analysis of climate simulation ensembles. A central question is how the data can be aggregated with minimal information loss; to address it, a key technique applied in several places in this work is clustering.

    The first part of the thesis addresses the challenge of finding clusters in the ensemble simulation data. Various distance metrics lend themselves to the comparison of scalar fields and are explored theoretically and practically. A visual analytics interface allows the user to interactively explore and compare multiple parameter settings for the clustering and to investigate the resulting clusters, i.e., prototypical climate phenomena. A central contribution here is the development of design principles for analyzing variability in decadal climate simulations, which has led to a visualization system centered around the new Clustering Timeline. This is a variant of a Sankey diagram that uses clustering results to communicate climatic states over time, coupled with ensemble member agreement. It can reveal several interesting properties of the dataset, such as: into how many inherently similar groups the ensemble can be divided at any given time, whether the ensemble diverges overall, whether there are distinct phases over time, possible periodicity, or outliers. The Clustering Timeline is also used to compare multiple climate simulation models and assess their performance.

    The Hierarchical Clustering Timeline is an advanced version of the above. It introduces a cluster hierarchy that groups the dataset, from the whole ensemble down to the individual static scalar fields, into clusters of various sizes and densities, recording the nesting relationships between them. A further contribution of this work in terms of visualization research is an investigation of how a hierarchical clustering of time-dependent scalar fields can be used in practice to analyze the data. To this end, a system of different views is proposed, linked through various interaction possibilities. The main advantage of the system is that a dataset can now be inspected at an arbitrary level of detail without having to recompute a clustering with different parameters. Interesting branches of the simulation can be expanded to reveal smaller differences in critical clusters, or folded to show only a coarse representation of the less interesting parts of the dataset.

    The last building block of the suite of visual analysis methods developed for this thesis aims at a robust, (largely) automatic detection and tracking of certain features in a scalar field ensemble. Techniques are presented that identify and track super- and sub-level sets, and from these sets I derive "centers of action" that mark the location of extremal climate phenomena governing the weather (e.g., the Icelandic Low and the Azores High). The thesis also presents visual and quantitative techniques to evaluate temporal change in the positions of these centers; such a displacement would likely manifest in changes in weather. In a preliminary analysis with my collaborators, we indeed observed changes in the loci of the centers of action in a simulation with increased greenhouse gas concentration compared to pre-industrial concentration levels.
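    As a concrete illustration of the level-set-based feature extraction, here is a minimal sketch (assuming a 2D sea-level-pressure field stored as a NumPy array; the threshold value and the use of a plain component centroid are illustrative choices, not necessarily the thesis's exact method):

```python
import numpy as np
from scipy import ndimage

def centers_of_action(field, threshold, above=True):
    """Extract the super- (or sub-) level set of a 2D scalar field and
    return the centroid of each connected component as a 'center of action'."""
    # Super-level set: grid points where the field exceeds the threshold;
    # sub-level set: where it falls below.
    mask = field >= threshold if above else field <= threshold
    # Label the connected components of the level set.
    labels, n = ndimage.label(mask)
    # The centroid of each component marks the feature's location.
    return [ndimage.center_of_mass(mask, labels, i) for i in range(1, n + 1)]

# Example: recover a synthetic low-pressure center (a sub-level set),
# loosely in the spirit of locating the Icelandic Low.
y, x = np.mgrid[0:100, 0:100]
slp = 1013.0 - 30.0 * np.exp(-((x - 30.0)**2 + (y - 60.0)**2) / 200.0)
print(centers_of_action(slp, threshold=1000.0, above=False))  # ~ [(60.0, 30.0)]
```

    Tracking then amounts to matching these centers between consecutive time steps, e.g., by nearest-neighbor distance.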

    Stable reliability diagrams for probabilistic classifiers

    A probability forecast or probabilistic classifier is reliable or calibrated if the predicted probabilities are matched by ex post observed frequencies, as examined visually in reliability diagrams. The classical binning-and-counting approach to plotting reliability diagrams has been hampered by a lack of stability under unavoidable, ad hoc implementation decisions. Here, we introduce the CORP approach, which generates provably statistically consistent, optimally binned, and reproducible reliability diagrams in an automated way. CORP is based on nonparametric isotonic regression and implemented via the pool-adjacent-violators (PAV) algorithm; essentially, the CORP reliability diagram shows the graph of the PAV-(re)calibrated forecast probabilities. The CORP approach allows for uncertainty quantification via either resampling techniques or asymptotic theory, furnishes a numerical measure of miscalibration, and provides a CORP-based Brier-score decomposition that generalizes to any proper scoring rule. We anticipate that judicious uses of the PAV algorithm will yield improved tools for diagnostics and inference for a very wide range of statistical and machine learning methods.
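    For intuition, a minimal sketch of the PAV-based recalibration at the core of CORP (using scikit-learn's isotonic regression, which implements the PAV algorithm; the synthetic data and the Brier-score bookkeeping are illustrative, not the paper's code):

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
p = np.sort(rng.uniform(size=1000))                 # forecast probabilities
y = rng.binomial(1, np.clip(1.2 * p - 0.1, 0, 1))   # outcomes from a miscalibrated truth

# PAV: monotone (isotonic) regression of outcomes on forecast probabilities.
# The fitted values are the (re)calibrated probabilities; plotting them
# against the forecasts yields the CORP reliability diagram.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
calibrated = iso.fit_transform(p, y)

# Miscalibration (MCB) component of the Brier-score decomposition:
# the score of the original forecasts minus that of the recalibrated ones.
brier = lambda f, o: np.mean((f - o) ** 2)
print(brier(p, y) - brier(calibrated, y))
```

    The same construction works with any proper scoring rule in place of the Brier score, which is the basis of the generalized decomposition the paper describes.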

    Data Mining in Electronic Commerce

    Modern business is rushing toward e-commerce. If the transition is done properly, it enables better management, new services, lower transaction costs and better customer relations. Success depends on skilled information technologists, among whom are statisticians. This paper focuses on some of the contributions that statisticians are making to help change the business world, especially through the development and application of data mining methods. This is a very large area, and the topics we cover are chosen to avoid overlap with other papers in this special issue, as well as to respect the limitations of our expertise. Inevitably, electronic commerce has raised and is raising fresh research problems in a very wide range of statistical areas, and we try to emphasize those challenges. Published at http://dx.doi.org/10.1214/088342306000000204 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Application of neural networks to subseasonal to seasonal predictability in present and future climates

    The Earth system is known for its lack of predictability on subseasonal to seasonal timescales (S2S; 2 weeks to a season). Yet accurate predictions on these timescales provide crucial, actionable lead times for the agriculture, energy, and water management sectors. Fortunately, specific Earth system states, deemed forecasts of opportunity, can be leveraged to improve prediction skill. Our current understanding of these opportunities is rooted in our knowledge of the historical climate. Depending on societal actions, the future climate could vary drastically, and these possible futures could lead to varying changes in S2S predictability. In recent years, neural networks have been successfully applied to weather and climate prediction. With the rapid development of neural network explainability techniques, the application of neural networks now also provides an opportunity to further understand our climate system. The research presented here demonstrates the utility of explainable neural networks for S2S prediction and for predictability changes under future climates. The first study presents a novel approach for identifying forecasts of opportunity in observations using neural network confidence. It further demonstrates that neural networks can be used to gain physical insight into predictability through explainability techniques. We then employ this methodology to explore S2S predictability differences in two future scenarios: anthropogenic climate change and stratospheric aerosol injection (SAI). In particular, we explore changes in subseasonal predictability and forecasts of opportunity under anthropogenic warming compared to a historical climate in the CESM2-LE. We then investigate how future seasonal predictability may differ under SAI compared to a future without SAI deployment in the ARISE-SAI simulations. We find differences in predictability between the historical and future climates and between the two future scenarios, respectively, where the largest differences in skill generally occur during forecasts of opportunity. This demonstrates that the forecasts-of-opportunity approach presented in the first study is useful for identifying differences in future S2S predictability that may not be identified when examining predictability across all predictions. Overall, these results demonstrate that neural networks are useful tools for exploring subseasonal to seasonal predictability, its sources, and its future changes.
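    A minimal sketch of the confidence-based forecasts-of-opportunity idea (NumPy only; the placeholder network outputs and the 20% retention fraction are assumptions for illustration, not the studies' configuration):

```python
import numpy as np

def opportunity_skill(class_probs, truth, keep_frac=0.2):
    """Compare accuracy over all forecasts with accuracy over only the
    network's most confident forecasts ('forecasts of opportunity')."""
    confidence = class_probs.max(axis=1)      # max predicted class probability
    predicted = class_probs.argmax(axis=1)
    correct = predicted == truth
    n_keep = max(1, int(keep_frac * len(truth)))
    top = np.argsort(confidence)[-n_keep:]    # most confident forecasts
    return correct.mean(), correct[top].mean()

# Placeholder outputs of a 2-class network on 500 forecasts:
rng = np.random.default_rng(1)
probs = rng.dirichlet([1.0, 1.0], size=500)
truth = rng.integers(0, 2, size=500)
print(opportunity_skill(probs, truth))  # (overall accuracy, opportunity accuracy)
```

    For a skillful network, accuracy over the confident subset exceeds overall accuracy, and comparing the subset's skill across climate scenarios mirrors the comparison described above.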

    Physically Explainable Deep Learning for Convective Initiation Nowcasting Using GOES-16 Satellite Observations

    Convection initiation (CI) nowcasting remains a challenging problem for both numerical weather prediction models and existing nowcasting algorithms. In this study, object-based probabilistic deep learning models are developed to predict CI from multichannel infrared GOES-R satellite observations. The data come from patches surrounding potential CI events identified in Multi-Radar Multi-Sensor Doppler weather radar products over the Great Plains region from June and July 2020 and June 2021; an objective radar-based approach is used to identify these events. The deep learning models significantly outperform the classical logistic model at lead times up to 1 hour, especially in false alarm ratio. Case studies show that the deep learning model's predictions depend on the characteristics of clouds and moisture at multiple levels. Model explanation further reveals the model's decision-making process under different baselines, and the explanation results highlight the importance of moisture and cloud features at different levels depending on the choice of baseline. Our study demonstrates the advantage of using different baselines for further understanding model behavior and gaining scientific insights.
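    A minimal sketch of baseline-dependent attribution of the kind the study leverages (a hand-rolled integrated-gradients loop in PyTorch; the toy CNN, input shape, and both baseline choices are illustrative assumptions, not the paper's model or configuration):

```python
import torch

def integrated_gradients(model, x, baseline, steps=50):
    """Attribute the model's scalar output to input pixels by averaging
    gradients along the straight path from a baseline to the input."""
    alphas = torch.linspace(0.0, 1.0, steps).view(-1, 1, 1, 1)
    path = baseline + alphas * (x - baseline)   # interpolated inputs
    path.requires_grad_(True)
    model(path).sum().backward()                # gradients at every path step
    avg_grad = path.grad.mean(dim=0)            # Riemann approximation of the integral
    return (x - baseline) * avg_grad            # scaled per-pixel attribution

# Toy CNN standing in for the CI model; 4 channels mimic infrared bands.
model = torch.nn.Sequential(
    torch.nn.Conv2d(4, 8, 3), torch.nn.ReLU(), torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(), torch.nn.Linear(8, 1), torch.nn.Sigmoid())
x = torch.randn(1, 4, 32, 32)
zero_base = torch.zeros_like(x)                                     # one baseline choice
mean_base = x.mean(dim=(2, 3), keepdim=True).expand_as(x).clone()   # another
# Per-channel importance can differ between the two baselines, which is
# the behavior the study examines.
print(integrated_gradients(model, x, zero_base).abs().sum(dim=(2, 3)))
print(integrated_gradients(model, x, mean_base).abs().sum(dim=(2, 3)))
```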

    Assessing the Predictability of Convection Initiation Using an Object-Based Approach

    Improvements in numerical forecasts of deep, moist convection have been notable in recent years, due in large part to increased computational power allowing for the explicit numerical representation of convection. Accurately forecasting the timing and location of convection initiation (CI), however, remains a substantial forecast challenge. This is attributed to the inherently limited intrinsic predictability of CI, owing to its dependence on highly nonlinear moist physics and fine-scale atmospheric processes that are poorly represented in observations. Because CI is the starting point of deep, moist convection that grows upscale, even small errors in initial convective development can rapidly spread to larger scales, with potentially significant impacts on downstream forecasts. This study investigates the practical predictability of CI using the Advanced Research Weather Research and Forecasting (WRF-ARW) model with a horizontal grid spacing of 429 meters. A unique object-based method is used to evaluate very high-resolution model performance for twenty-five cases of CI across the west-central High Plains of the United States from the 2010 convective season. CI objects are defined as areas of higher observed or model-simulated radar reflectivity that develop and remain sustained for a sufficient period of time. Model simulations demonstrate an average probability of detection of 0.835 but, due to significant overproduction of CI, an average false alarm ratio of 0.664 and a bias ratio of 2.49. The average critical success index across all simulations is 0.315. Model CI objects that are matched with observed CI objects show, on average, an early bias of about 7 minutes and distance errors of around 62 kilometers. The operational utility and inherent biases of such high-resolution simulations are discussed.
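    The scores quoted above follow from standard contingency-table counts over matched CI objects. A minimal sketch (the counts below are hypothetical, chosen only so the resulting scores roughly reproduce the reported averages; the study reports averages over 25 cases, not raw counts):

```python
def ci_verification_scores(hits, misses, false_alarms):
    """Standard object-based verification scores for CI forecasts."""
    pod = hits / (hits + misses)                    # probability of detection
    far = false_alarms / (hits + false_alarms)      # false alarm ratio
    bias = (hits + false_alarms) / (hits + misses)  # bias ratio (>1: overproduction)
    csi = hits / (hits + misses + false_alarms)     # critical success index
    return pod, far, bias, csi

# Hypothetical counts giving POD ~0.835, FAR ~0.664, bias ~2.49, CSI ~0.315:
print(ci_verification_scores(hits=167, misses=33, false_alarms=330))
```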