    Improving upon the Neyman-Pearson Approach to Testing Hypotheses

    Bayesian optimization for materials design

    We introduce Bayesian optimization, a technique developed for optimizing time-consuming engineering simulations and for fitting machine learning models on large datasets. Bayesian optimization guides the choice of experiments during materials design and discovery to find good material designs in as few experiments as possible. We focus on the case when material designs are parameterized by a low-dimensional vector. Bayesian optimization is built on a statistical technique called Gaussian process regression, which allows predicting the performance of a new design based on previously tested designs. After providing a detailed introduction to Gaussian process regression, we introduce two Bayesian optimization methods: expected improvement, for design problems with noise-free evaluations; and the knowledge-gradient method, which generalizes expected improvement and may be used in design problems with noisy evaluations. Both methods are derived using a value-of-information analysis, and enjoy one-step Bayes-optimality.
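    As a rough illustration of the expected improvement criterion named above, the following sketch computes the standard closed-form value for a maximization problem, assuming the Gaussian process posterior at a candidate design is summarized by a mean and standard deviation. The function name, argument names, and the candidate values are illustrative assumptions, not taken from the paper.

```python
import math


def expected_improvement(mu, sigma, best_so_far):
    """Closed-form expected improvement for maximization.

    Assumes the posterior on the design's value is N(mu, sigma^2) and that
    best_so_far is the best noise-free value observed so far.
    """
    if sigma <= 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(mu - best_so_far, 0.0)
    z = (mu - best_so_far) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (mu - best_so_far) * cdf + sigma * pdf


# Hypothetical posterior summaries (mean, standard deviation) for three designs.
candidates = {"A": (1.0, 0.1), "B": (0.8, 0.6), "C": (1.1, 0.0)}
best = 1.05
next_design = max(candidates, key=lambda k: expected_improvement(*candidates[k], best))
```

    Under these made-up numbers the criterion trades off a high predicted mean against high uncertainty, so a design with a lower mean but a wide posterior can still be selected for the next experiment.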

    National CO2 budgets (2015–2020) inferred from atmospheric CO2 observations in support of the Global Stocktake

    Accurate accounting of emissions and removals of CO2 is critical for the planning and verification of emission reduction targets in support of the Paris Agreement. Here, we present a pilot dataset of country-specific net carbon exchange (NCE; fossil plus terrestrial ecosystem fluxes) and terrestrial carbon stock changes aimed at informing countries’ carbon budgets. These estimates are based on "top-down" NCE outputs from the v10 Orbiting Carbon Observatory (OCO-2) modeling intercomparison project (MIP), wherein an ensemble of inverse modeling groups conducted standardized experiments assimilating OCO-2 column-averaged dry-air mole fraction (XCO2) retrievals (ACOS v10), in situ CO2 measurements, or combinations of these data. The v10 OCO-2 MIP NCE estimates are combined with "bottom-up" estimates of fossil fuel emissions and lateral carbon fluxes to estimate changes in terrestrial carbon stocks, which are impacted by anthropogenic and natural drivers. These flux and stock change estimates are reported annually (2015–2020) as both a global 1° × 1° gridded dataset and as a country-level dataset. Across the v10 OCO-2 MIP experiments, we obtain increases in the ensemble median terrestrial carbon stocks of 3.29–4.58 PgCO2 yr⁻¹ (0.90–1.25 PgC yr⁻¹). This is a result of broad increases in terrestrial carbon stocks across the northern extratropics, while the tropics generally have stock losses but with considerable regional variability and differences between v10 OCO-2 MIP experiments. We discuss the state of the science for tracking emissions and removals using top-down methods, including current limitations and future developments towards top-down monitoring and verification systems.
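    The two unit systems quoted for the stock changes are related by the mass fraction of carbon in CO2. The short sketch below reproduces the conversion, assuming the usual rounded molar masses of 12 g/mol for C and 44 g/mol for CO2; the function name is illustrative.

```python
# Mass fraction of carbon in CO2, using rounded molar masses (an assumption).
C_PER_CO2 = 12.0 / 44.0


def pg_co2_to_pg_c(pg_co2):
    """Convert a mass of CO2 (PgCO2) to the mass of carbon it contains (PgC)."""
    return pg_co2 * C_PER_CO2


low = round(pg_co2_to_pg_c(3.29), 2)   # lower end of the reported range
high = round(pg_co2_to_pg_c(4.58), 2)  # upper end of the reported range
```

    Rounded to two decimals, these values agree with the 0.90–1.25 PgC range given in the abstract.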

    Mines and mine-like objects are distributed throughout an area of interest. Remote sensing of the area from an aircraft yields image data that represent the superposition of electromagnetic emissions from the mines and mine-like objects. In this article we build a hierarchical statistical model for the reconstruction of mine locations given a point pattern of the superposition of mines and mine-like objects. It is shown how inference on the mine locations can be obtained using Markov chain Monte Carlo methods.

    Based on remote sensing of a potential minefield, point locations are identified, some of which may not be mines. The mines and mine-like objects are to be distinguished based on their point patterns, although it must be emphasized that all we see is the superposition of their locations. In this paper, we construct a hierarchical spatial point-process model that accounts for the different patterns of mines and mine-like objects and uses posterior analysis to distinguish between them. Our Bayesian approach is applied to COBRA image data obtained from the NSWC Coastal Systems Station, Dahlgren Division, Panama City, Florida. 2003 Copyright SPIE - The International Society for Optical Engineering

    Hierarchical modeling of count data with application to nuclear fall-out

    Under more general assumptions than those usually made in the sequential analysis literature, a variable-sample-size-sequential probability ratio test (VPRT) of two simple hypotheses is found that maximizes the expected net gain over all sequential decision procedures. In contrast, Wald and Wolfowitz [25] developed the sequential probability ratio test (SPRT) to minimize expected sample size, but their assumptions on the parameters of the decision problem were restrictive. In this article we show that the expected net-gain-maximizing VPRT also minimizes the expected (with respect to both data and prior) total sampling cost and that, under slightly more general conditions than those imposed by Wald and Wolfowitz, it reduces to the one-observation-at-a-time sequential probability ratio test (SPRT). The ways in which the size and power of the VPRT depend upon the parameters of the decision problem are also examined.
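    For orientation, the classical one-observation-at-a-time SPRT to which the VPRT reduces can be sketched as follows. This is not the VPRT itself: it assumes Bernoulli observations, two simple hypotheses p0 and p1 with 0 < p0, p1 < 1, and Wald's approximate stopping thresholds based on nominal error rates alpha and beta. All names are illustrative.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """Classical SPRT of H0: p = p0 versus H1: p = p1 for 0/1 data.

    Uses Wald's approximate thresholds on the log-likelihood ratio.
    Returns (decision, number_of_observations_used).
    """
    upper = math.log((1.0 - beta) / alpha)   # cross above: accept H1
    lower = math.log(beta / (1.0 - alpha))   # cross below: accept H0
    llr = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        # Add the log-likelihood ratio contribution of this observation.
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1.0 - p1) / (1.0 - p0))
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "undecided", n


decision, used = sprt_bernoulli([1, 1, 1, 1, 1], p0=0.2, p1=0.8)
```

    The test stops as soon as the accumulated evidence crosses either threshold, so the sample size is random; a variable-sample-size procedure additionally chooses how many observations to take at each stage.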

    The analysis of a spatial point pattern often involves looking for spatial structure, such as clustering or regularity in the points (or events). For example, it is of biological interest to characterize the pattern of tree locations in a forest. This has traditionally been done using global summaries, such as the K-function or its differential, the product density function. In this article, we define a local version of the product density function for each event, derived under a definition of a local indicator of spatial association (LISA). These product density LISA functions can then be grouped into bundles of similar functions using multivariate hierarchical clustering techniques. The bundles can then be visualized by a replotting of the data, obtained via classical multidimensional scaling of the statistical distances between functions. Thus, we propose a different way of looking for structure based on how an event relates to nearby events. We apply this method to a point pattern of pine saplings in a Finnish forest and show remarkable, heretofore undiscovered, spatial structure in the data.
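    To make the global summary mentioned above concrete, here is a minimal sketch of a naive estimate of Ripley's K-function for a planar point pattern. It assumes no edge correction and a known study-region area, and it is not the product density LISA function proposed in the paper. The function name and the toy coordinates are illustrative.

```python
import math


def ripley_k(points, r, area):
    """Naive K-function estimate at distance r, without edge correction.

    K(r) is estimated as area / (n * (n - 1)) times the number of ordered
    pairs of distinct events that lie within distance r of each other.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two events")
    pairs_within_r = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= r:
                pairs_within_r += 1
    return area * pairs_within_r / (n * (n - 1))


# Toy pattern: four events at the corners of a unit square inside a 2 x 2 region.
events = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
k_at_1 = ripley_k(events, r=1.0, area=4.0)
```

    Under complete spatial randomness K(r) is approximately pi * r^2, so estimates well above that value suggest clustering at scale r and values well below it suggest regularity.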