    Fast increased fidelity approximate Gibbs samplers for Bayesian Gaussian process regression

    The use of Gaussian processes (GPs) is supported by efficient sampling algorithms, a rich methodological literature, and strong theoretical grounding. However, due to their prohibitive computation and storage demands, the use of exact GPs in Bayesian models is limited to problems containing at most several thousand observations. Sampling requires matrix operations that scale at $\mathcal{O}(n^3)$, where $n$ is the number of unique inputs. Storage of individual matrices scales at $\mathcal{O}(n^2)$ and can quickly overwhelm the resources of most modern computers. To overcome these bottlenecks, we develop a sampling algorithm using $\mathcal{H}$ matrix approximation of the matrices comprising the GP posterior covariance. These matrices can approximate the true conditional covariance matrix within machine precision and allow for sampling algorithms that scale at $\mathcal{O}(n \log^2 n)$ time and storage demands scaling at $\mathcal{O}(n \log n)$. We also describe how these algorithms can be used as building blocks to model higher dimensional surfaces at $\mathcal{O}(d \, n \log^2 n)$, where $d$ is the dimension of the surface under consideration, using tensor products of one-dimensional GPs. Though various scalable processes have been proposed for approximating Bayesian GP inference when $n$ is large, to our knowledge, none of these methods show that the approximation's Kullback-Leibler divergence to the true posterior can be made arbitrarily small and may be no worse than the approximation provided by finite computer arithmetic. We describe $\mathcal{H}$-matrices, give an efficient Gibbs sampler using these matrices for one-dimensional GPs, offer a proposed extension to higher dimensional surfaces, and investigate the performance of this fast increased fidelity approximate GP, FIFA-GP, using both simulated and real data sets.
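
For context on the bottleneck this abstract describes, the following is a minimal sketch of exact (dense) GP posterior sampling, not the FIFA-GP algorithm itself. The kernel choice, the function names `sq_exp_kernel` and `sample_gp_posterior`, and all parameter values are illustrative assumptions. The dense Cholesky factorizations are the cubic-time step, and the full covariance matrix is the quadratic-storage object, that an $\mathcal{H}$-matrix approximation is meant to avoid.

```python
import numpy as np

def sq_exp_kernel(x, y, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance between two 1-D input vectors (assumed kernel)."""
    diff = x[:, None] - y[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def sample_gp_posterior(x, y, noise_var=0.1, jitter=1e-6, rng=None):
    """Draw one sample of f at inputs x from p(f | y), where y = f + Gaussian noise.

    Exact dense computation: O(n^2) storage for K, O(n^3) time for Cholesky.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.shape[0]
    K = sq_exp_kernel(x, x)                              # dense n x n matrix
    L = np.linalg.cholesky(K + noise_var * np.eye(n))    # cubic-time factorization
    # Posterior mean: K (K + noise_var I)^{-1} y
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K @ alpha
    # Posterior covariance: K - K (K + noise_var I)^{-1} K
    V = np.linalg.solve(L, K)
    cov = K - V.T @ V
    # A small jitter keeps the second factorization numerically stable.
    L_post = np.linalg.cholesky(cov + jitter * np.eye(n))
    return mean + L_post @ rng.standard_normal(n)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 50)
    y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(50)
    print(sample_gp_posterior(x, y, rng=rng)[:5])
```

Doubling `n` in this sketch roughly quadruples memory use and multiplies the factorization time by about eight, which is why exact sampling stalls at a few thousand observations.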

    Bayesian joint modeling of chemical structure and dose response curves

    Today there are approximately 85,000 chemicals regulated under the Toxic Substances Control Act, with around 2,000 new chemicals introduced each year. It is impossible to screen all of these chemicals for potential toxic effects either via full organism in vivo studies or in vitro high-throughput screening (HTS) programs. Toxicologists face the challenge of choosing which chemicals to screen, and predicting the toxicity of as-yet-unscreened chemicals. Our goal is to describe how variation in chemical structure relates to variation in toxicological response to enable in silico toxicity characterization designed to meet both of these challenges. With our Bayesian partially Supervised Sparse and Smooth Factor Analysis (BS3FA) model, we learn a distance between chemicals targeted to toxicity, rather than one based on molecular structure alone. Our model also enables the prediction of chemical dose-response profiles based on chemical structure (that is, without in vivo or in vitro testing) by taking advantage of a large database of chemicals that have already been tested for toxicity in HTS programs. We show superior simulation performance in distance learning and modest to large gains in predictive ability compared to existing methods. Results from the high-throughput screening data application elucidate the relationship between chemical structure and a toxicity-relevant high-throughput assay. An R package for BS3FA is available online at https://github.com/kelrenmor/bs3fa

    Bayesian Hierarchical Factor Regression Models to Infer Cause of Death From Verbal Autopsy Data

    In low-resource settings where vital registration of death is not routine it is often of critical interest to determine and study the cause of death (COD) for individuals and the cause-specific mortality fraction (CSMF) for populations. Post-mortem autopsies, considered the gold standard for COD assignment, are often difficult or impossible to implement due to deaths occurring outside the hospital, expense, and/or cultural norms. For this reason, Verbal Autopsies (VAs) are commonly conducted, consisting of a questionnaire administered to next of kin recording demographic information, known medical conditions, symptoms, and other factors for the decedent. This article proposes a novel class of hierarchical factor regression models that avoid restrictive assumptions of standard methods, allow both the mean and covariance to vary with COD category, and can include covariate information on the decedent, region, or events surrounding death. Taking a Bayesian approach to inference, this work develops an MCMC algorithm and validates the FActor Regression for Verbal Autopsy (FARVA) model in simulation experiments. An application of FARVA to real VA data shows improved goodness-of-fit and better predictive performance in inferring COD and CSMF over competing methods. Code and a user manual are made available at https://github.com/kelrenmor/farva

    Estimates of Micro-, Nano-, and Picoplankton Contributions to Particle Export in the Northeast Pacific

    The contributions of micro-, nano-, and picoplankton to particle export were estimated from measurements of size-fractionated particulate $^{234}$Th, organic carbon, and phytoplankton indicator pigments obtained during five cruises between 2010 and 2012 along Line P in the subarctic northeast Pacific Ocean. Sinking fluxes of particulate organic carbon (POC) and indicator pigments were calculated from $^{234}$Th–$^{238}$U disequilibria and, during two cruises, measured by sediment trap at Ocean Station Papa. POC fluxes at 100 m ranged from 0.65 to 7.95 mmol m$^{-2}$ d$^{-1}$, similar in magnitude to previous results at Line P. Microplankton pigments dominated indicator pigment fluxes (averaging 69 ± 19% of total pigment flux), while nanoplankton pigments comprised the majority of pigment standing stocks (averaging 64 ± 23% of total pigment standing stock). Indicator pigment loss rates (the ratio of pigment export flux to pigment standing stock) point to preferential export of larger microplankton relative to smaller nano- and picoplankton. However, indicator pigments do not quantitatively trace particle export resulting from zooplankton grazing, which may be an important pathway for the export of small phytoplankton. These results have important implications for understanding the magnitude and mechanisms controlling the biological pump at Line P in particular, and more generally in oligotrophic gyres and high-nutrient, low-chlorophyll regions where small phytoplankton represent a major component of the autotrophic community.

    Cosmic-Enu: An emulator for the non-linear neutrino power spectrum

    Cosmology is poised to measure the neutrino mass sum $M_\nu$ and has identified several smaller-scale observables sensitive to neutrinos, necessitating accurate predictions of neutrino clustering over a wide range of length scales. The FlowsForTheMasses non-linear perturbation theory for the massive neutrino power spectrum, $\Delta^2_\nu(k)$, agrees with its companion N-body simulation at the $10\%-15\%$ level for $k \leq 1~h/$Mpc. Building upon the Mira-Titan IV emulator for the cold matter, we use FlowsForTheMasses to construct an emulator for $\Delta^2_\nu(k)$ covering a large range of cosmological parameters and neutrino fractions $\Omega_{\nu,0} h^2 \leq 0.01$, which corresponds to $M_\nu \leq 0.93$ eV. Consistent with FlowsForTheMasses at the $3.5\%$ level, it returns a power spectrum in milliseconds. Ranking the neutrinos by initial momenta, we also emulate the power spectra of momentum deciles, providing information about their perturbed distribution function. Comparing a $M_\nu = 0.15$ eV model to a wide range of N-body simulation methods, we find agreement to $3\%$ for $k \leq 3 k_\mathrm{FS} = 0.17~h/$Mpc and to $19\%$ for $k \leq 0.4~h/$Mpc. We find that the enhancement factor, the ratio of $\Delta^2_\nu(k)$ to its linear-response equivalent, is most strongly correlated with $\Omega_{\nu,0} h^2$, and also with the clustering amplitude $\sigma_8$. Furthermore, non-linearities enhance the free-streaming-limit scaling $\partial \log(\Delta^2_\nu / \Delta^2_{\rm m}) / \partial \log(M_\nu)$ beyond its linear value of 4, increasing the $M_\nu$-sensitivity of the small-scale neutrino density. Comment: 17 pages, 14 figures, 3 tables. Emulator code available at: https://github.com/upadhye/Cosmic-En

    Empirical Validation of a New Data Product from the Interstellar Boundary Explorer Satellite

    Since 2008, the Interstellar Boundary Explorer (IBEX) satellite has been gathering data on heliospheric energetic neutral atoms (ENAs) while being exposed to various sources of background noise, such as cosmic rays and solar energetic particles. The IBEX mission initially released only a qualified triple-coincidence (qABC) data product, which was designed to provide observations of ENAs free of background contamination. Further measurements revealed that the qABC data was in fact susceptible to contamination, having relatively low ENA counts and high background rates. Recently, the mission team considered releasing a certain qualified double-coincidence (qBC) data product, which has roughly twice the detection rate of the qABC data product. This paper presents a simulation-based validation of the new qBC data product against the already-released qABC data product. The results show that the qBCs can plausibly be said to share the same signal rate as the qABCs up to an average absolute deviation of 3.6%. Visual diagnostics at an orbit, map, and full mission level provide additional confirmation of signal rate coherence across data products. These approaches are generalizable to other scenarios in which one wishes to test whether multiple observations could plausibly be generated by some underlying shared signal.

    Assimilating Remote Sensing Observations of Leaf Area Index and Soil Moisture for Wheat Yield Estimates: An Observing System Simulation Experiment

    Observing system simulation experiments were used to investigate ensemble Bayesian state updating data assimilation of observations of leaf area index (LAI) and soil moisture (theta) for the purpose of improving single-season wheat yield estimates with the Decision Support System for Agrotechnology Transfer (DSSAT) CropSim-Ceres model. Assimilation was conducted in an energy-limited environment and a water-limited environment. Modeling uncertainty was prescribed to weather inputs, soil parameters and initial conditions, and cultivar parameters, and through perturbations to model state transition equations. The ensemble Kalman filter and the sequential importance resampling filter were tested for the ability to attenuate effects of these types of uncertainty on yield estimates. LAI and theta observations were synthesized according to characteristics of existing remote sensing data, and effects of observation error were tested. Results indicate that the potential for assimilation to improve end-of-season yield estimates is low. Limitations are due to a lack of root zone soil moisture information, error in LAI observations, and a lack of correlation between leaf and grain growth.
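
As a rough illustration of the ensemble Kalman filter named in this abstract, the sketch below implements a generic stochastic (perturbed-observation) analysis step. It is not the study's DSSAT configuration: the function name `enkf_update`, the two-variable toy state (LAI and a biomass-like quantity), and every numeric value are assumptions made for the example.

```python
import numpy as np

def enkf_update(ensemble, obs, obs_operator, obs_var, rng=None):
    """One stochastic EnKF analysis step.

    ensemble:     (n_members, n_state) prior state ensemble
    obs:          (n_obs,) observation vector
    obs_operator: (n_obs, n_state) linear map from state to observation space
    obs_var:      scalar observation error variance (assumed uncorrelated errors)
    """
    rng = np.random.default_rng() if rng is None else rng
    n_members, n_obs = ensemble.shape[0], obs.shape[0]
    predicted = ensemble @ obs_operator.T                 # ensemble mapped to obs space
    state_anom = ensemble - ensemble.mean(axis=0)
    pred_anom = predicted - predicted.mean(axis=0)
    # Sample cross-covariance and innovation covariance.
    P_xy = state_anom.T @ pred_anom / (n_members - 1)
    P_yy = pred_anom.T @ pred_anom / (n_members - 1) + obs_var * np.eye(n_obs)
    gain_T = np.linalg.solve(P_yy, P_xy.T)                # transpose of the Kalman gain
    # Each member assimilates its own noisy copy of the observation.
    perturbed = obs + np.sqrt(obs_var) * rng.standard_normal((n_members, n_obs))
    return ensemble + (perturbed - predicted) @ gain_T

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Toy prior: LAI ~ N(3, 1); second variable correlated with LAI (illustrative only).
    lai = 3.0 + rng.standard_normal(200)
    biomass = 2.0 * lai + 0.5 * rng.standard_normal(200)
    prior = np.column_stack([lai, biomass])
    H = np.array([[1.0, 0.0]])                            # only LAI is observed
    posterior = enkf_update(prior, np.array([4.0]), H, obs_var=0.01, rng=rng)
    print(prior.mean(axis=0), posterior.mean(axis=0))
```

In this toy setup the unobserved variable is corrected only through its sample correlation with the observed one, which is consistent with the abstract's point that weak correlation between leaf and grain growth limits what LAI assimilation can do for yield.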

    Spectropolarimetric Evidence for Radiatively Inefficient Accretion in an Optically Dull Active Galaxy

    We present Subaru/FOCAS spectropolarimetry of two active galaxies in the Cosmic Evolution Survey. These objects were selected to be optically dull, with the bright X-ray emission of an AGN but missing optical emission lines in our previous spectroscopy. Our new observations show that one target has very weak emission lines consistent with an optically dull AGN, while the other object has strong emission lines typical of a host-diluted Type 2 Seyfert galaxy. In neither source do we observe polarized emission lines, with 3-sigma upper limits of P_BLR < 2%. This means that the missing broad emission lines (and weaker narrow emission lines) are not due to simple anisotropic obscuration, e.g., by the canonical AGN torus. The weak-lined optically dull AGN exhibits a blue polarized continuum with P = 0.78 +/- 0.07% at 4400 A < lambda_rest < 7200 A (P = 1.37 +/- 0.16% at 4400 A < lambda_rest < 5050 A). The wavelength dependence of this polarized flux is similar to that of an unobscured AGN continuum and represents the intrinsic AGN emission, either as synchrotron emission or the outer part of an accretion disk reflected by a clumpy dust scatterer. Because this intrinsic AGN emission lacks emission lines, this source is likely to have a radiatively inefficient accretion flow. Comment: Accepted to ApJ. 6 pages, 2 figure