52 research outputs found

    A Hierarchical Spatio-Temporal Statistical Model Motivated by Glaciology

    In this paper, we extend and analyze a Bayesian hierarchical spatio-temporal model for physical systems. A novelty is to model the discrepancy between the output of a computer simulator for a physical process and the actual process values with a multivariate random walk. For computational efficiency, linear algebra for bandwidth-limited matrices is utilized, and first-order emulator inference allows for the fast emulation of a numerical partial differential equation (PDE) solver. A test scenario from a physical system motivated by glaciology is used to examine the speed and accuracy of the computational methods used, in addition to the viability of modeling assumptions. We conclude by discussing how the model and associated methodology can be applied in other physical contexts besides glaciology. Comment: Revision accepted for publication by the Journal of Agricultural, Biological, and Environmental Statistics.
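The banded-matrix point in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the simplest banded case, a tridiagonal system such as the precision matrix of a first-order random walk plus a diagonal term, and solves it in O(n) with the Thomas algorithm. The names `solve_tridiagonal` and `random_walk_precision`, and all parameter values, are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve A x = rhs for a tridiagonal A in O(n) (Thomas algorithm).

    lower: sub-diagonal (length n-1), diag: main diagonal (length n),
    upper: super-diagonal (length n-1). No pivoting is done, which is
    safe for symmetric positive definite or diagonally dominant A.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    # Forward elimination: remove the sub-diagonal row by row.
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def random_walk_precision(n, kappa=1.0, nugget=1.0):
    """Bands of kappa * Q + nugget * I, where Q is the first-order
    random-walk structure matrix (assumed illustrative model, n >= 2)."""
    diag = [nugget + kappa * (1.0 if i in (0, n - 1) else 2.0) for i in range(n)]
    off = [-kappa] * (n - 1)
    return off, diag, off


# Hypothetical usage: one linear solve of the kind needed for a
# conditional mean in a Gaussian model with random-walk structure.
lower, diag, upper = random_walk_precision(5)
x = solve_tridiagonal(lower, diag, upper, [1.0, 0.0, 0.0, 0.0, 1.0])
```

Storing only the three bands keeps memory and work linear in the number of time points, which is the general reason banded linear algebra helps in models of this kind.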

    Max-and-Smooth: a two-step approach for approximate Bayesian inference in latent Gaussian models

    This is the final version. Available on open access from International Society for Bayesian Analysis (ISBA) via the DOI in this record. With modern high-dimensional data, complex statistical models are necessary, requiring computationally feasible inference schemes. We introduce Max-and-Smooth, an approximate Bayesian inference scheme for a flexible class of latent Gaussian models (LGMs) where one or more of the likelihood parameters are modeled by latent additive Gaussian processes. Max-and-Smooth consists of two steps. In the first step (Max), the likelihood function is approximated by a Gaussian density with mean and covariance equal to either (a) the maximum likelihood estimate and the inverse observed information, respectively, or (b) the mean and covariance of the normalized likelihood function. In the second step (Smooth), the latent parameters and hyperparameters are inferred and smoothed with the approximated likelihood function. The proposed method ensures that the uncertainty from the first step is correctly propagated to the second step. Since the approximated likelihood function is Gaussian, the approximate posterior density of the latent parameters of the LGM (conditional on the hyperparameters) is also Gaussian, thus facilitating efficient posterior inference in high dimensions. Furthermore, the approximate marginal posterior distribution of the hyperparameters is tractable, and as a result, the hyperparameters can be sampled independently of the latent parameters. In the case of a large number of independent data replicates, sparse precision matrices, and high-dimensional latent vectors, the speedup is substantial in comparison to an MCMC scheme that infers the posterior density from the exact likelihood function. The proposed inference scheme is demonstrated on one spatially referenced real dataset and on simulated data mimicking spatial, temporal, and spatio-temporal inference problems. Our results show that Max-and-Smooth is accurate and fast. NER
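The two steps described in this abstract can be sketched on a toy problem. This is an illustration under stated assumptions, not the authors' implementation. Each group has Poisson counts with log-rate `eta`. The latent prior is a simple independent Gaussian rather than an additive Gaussian process. The hyperparameters (`prior_mean`, `prior_var`) are held fixed instead of being inferred. The function names and the data are hypothetical.

```python
import math


def max_step(counts):
    """Max step, option (a): MLE of the log-rate and inverse observed information.

    For y_j ~ Poisson(exp(eta)), the MLE is log(mean(y)) and the observed
    information at the MLE equals n * exp(eta_hat) = sum(y).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("MLE of the log-rate is undefined when all counts are zero")
    eta_hat = math.log(total / len(counts))
    return eta_hat, 1.0 / float(total)


def smooth_step(eta_hat, var_hat, prior_mean, prior_var):
    """Smooth step: combine the Gaussian likelihood approximation
    N(eta_hat | eta, var_hat) with the Gaussian prior N(eta | prior_mean, prior_var).
    The result is Gaussian; the mean and variance are returned."""
    post_prec = 1.0 / var_hat + 1.0 / prior_var
    post_mean = (eta_hat / var_hat + prior_mean / prior_var) / post_prec
    return post_mean, 1.0 / post_prec


# Hypothetical data: three groups of replicate counts.
groups = [[3, 5, 4], [0, 1, 1, 2], [12, 9]]
posteriors = [smooth_step(*max_step(g), prior_mean=1.0, prior_var=0.5) for g in groups]
```

Because `var_hat` from the Max step enters the Smooth step as the likelihood variance, groups with less data are shrunk more strongly toward the prior mean. This is the uncertainty propagation the abstract refers to, shown here in its simplest conjugate form.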

    Approximate Bayesian inference for analysis of spatiotemporal flood frequency data

    This is the final version. Available from the Institute of Mathematical Statistics via the DOI in this record. Extreme floods cause casualties and widespread damage to property and vital civil infrastructure. Prediction of extreme floods, within gauged and ungauged catchments, is crucial to mitigate these disasters. In this paper, a Bayesian framework is proposed for predicting extreme floods, using the generalized extreme-value (GEV) distribution. A major methodological challenge is to find a suitable parametrization for the GEV distribution when multiple covariates and/or latent spatial effects are involved and a time trend is present. Other challenges involve balancing model complexity and parsimony, using an appropriate model selection procedure and making inference based on a reliable and computationally efficient approach. We here propose a latent Gaussian modeling framework with a novel multivariate link function designed to separate the interpretation of the parameters at the latent level and to avoid unreasonable estimates of the shape and time trend parameters. Structured additive regression models, which include catchment descriptors as covariates and spatially correlated model components, are proposed for the four parameters at the latent level. To achieve computational efficiency with large datasets and richly parametrized models, we exploit a highly accurate and fast approximate Bayesian inference approach which can also be used to efficiently select models separately for each of the four regression models at the latent level. We applied our proposed methodology to annual peak river flow data from 554 catchments across the United Kingdom. The framework performed well in terms of flood predictions for both ungauged catchments and future observations at gauged catchments. The results show that the spatial model components for the transformed location and scale parameters as well as the time trend are all important, and none of these should be ignored. Posterior estimates of the time trend parameters correspond to an average increase of about 1.5% per decade with range 0.1% to 2.8% and reveal a spatial structure across the United Kingdom. When the interest lies in estimating return levels for spatial aggregates, we further develop a novel copula-based postprocessing approach of posterior predictive samples in order to mitigate the effect of the conditional independence assumption at the data level, and we demonstrate that our approach indeed provides accurate results. University of Iceland Research Fund
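As a small worked example of the return levels mentioned in this abstract, the sketch below evaluates the T-year return level of a GEV distribution. It assumes the standard (location, scale, shape) parametrization, not the transformed parameters and multivariate link function of the paper. The function names and the parameter values are hypothetical.

```python
import math


def gev_cdf(x, loc, scale, shape):
    """CDF of the GEV distribution in the standard parametrization."""
    if abs(shape) < 1e-12:  # Gumbel limit
        return math.exp(-math.exp(-(x - loc) / scale))
    t = 1.0 + shape * (x - loc) / scale
    if t <= 0.0:
        # Outside the support: below the lower bound (shape > 0)
        # or above the upper bound (shape < 0).
        return 0.0 if shape > 0 else 1.0
    return math.exp(-t ** (-1.0 / shape))


def gev_return_level(period, loc, scale, shape):
    """Level exceeded on average once every `period` years,
    i.e. the (1 - 1/period) quantile of the annual-maximum distribution."""
    if period <= 1.0:
        raise ValueError("period must be greater than 1")
    y = -math.log(1.0 - 1.0 / period)
    if abs(shape) < 1e-12:
        return loc - scale * math.log(y)
    return loc + scale / shape * (y ** (-shape) - 1.0)


# Hypothetical parameters for annual peak flow at a single catchment.
z100 = gev_return_level(100.0, loc=50.0, scale=12.0, shape=0.1)
```

In a Bayesian analysis the same function would be applied to each posterior draw of the parameters, giving a posterior sample of return levels rather than a single number.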

    Approximate Bayesian inference for analysis of spatio-temporal flood frequency data

    This is the final version. Available from the Institute of Mathematical Statistics via the DOI in this record. Extreme floods cause casualties and widespread damage to property and vital civil infrastructure. We here propose a Bayesian approach for predicting extreme floods using the generalized extreme-value (GEV) distribution within gauged and ungauged catchments. A major methodological challenge is to find a suitable parametrization for the GEV distribution when covariates or latent spatial effects are involved. Other challenges involve balancing model complexity and parsimony using an appropriate model selection procedure, and making inference using a reliable and computationally efficient approach. Our approach relies on a latent Gaussian modeling framework with a novel multivariate link function designed to separate the interpretation of the parameters at the latent level and to avoid unreasonable estimates of the shape and time trend parameters. Structured additive regression models are proposed for the four parameters at the latent level. For computational efficiency with large datasets and richly parametrized models, we exploit an accurate and fast approximate Bayesian inference approach. We applied our proposed methodology to annual peak river flow data from 554 catchments across the United Kingdom (UK). Our model performed well in terms of flood predictions for both gauged and ungauged catchments. The results show that the spatial model components for the transformed location and scale parameters, and the time trend, are all important. Posterior estimates of the time trend parameters correspond to an average increase of about 1.5% per decade and reveal a spatial structure across the UK. To estimate return levels for spatial aggregates, we further develop a novel copula-based post-processing approach of posterior predictive samples, in order to mitigate the effect of the conditional independence assumption at the data level, and we show that our approach provides accurate results. University of Iceland Research Fund

    A Bayesian hierarchical model for glacial dynamics based on the shallow ice approximation and its evaluation using analytical solutions

    Bayesian hierarchical modeling can assist the study of glacial dynamics and ice flow properties. This approach will allow glaciologists to make fully probabilistic predictions for the thickness of a glacier at unobserved spatiotemporal coordinates, and it will also allow for the derivation of posterior probability distributions for key physical parameters such as ice viscosity and basal sliding. The goal of this paper is to develop a proof of concept for a Bayesian hierarchical model that uses exact analytical solutions for the shallow ice approximation (SIA) introduced by Bueler et al. (2005). A suite of test simulations utilizing these exact solutions suggests that this approach is able to adequately model numerical errors and produce useful physical parameter posterior distributions and predictions. A byproduct of the development of the Bayesian hierarchical model is the derivation of a novel finite difference method for solving the SIA partial differential equation (PDE). An additional novelty of this work is the use of a statistical model to correct the numerical errors induced by a numerical solution. This error-correcting process models numerical errors that accumulate forward in time and the spatial variation of numerical errors between the dome, interior, and margin of a glacier. The Icelandic Research Fund (RANNIS) is thanked for funding this research. Peer reviewed

    A Hierarchical Spatiotemporal Statistical Model Motivated by Glaciology

    This is a post-peer-review, pre-copyedit version of an article published in Journal of Agricultural, Biological and Environmental Statistics. The final authenticated version is available online at: http://dx.doi.org/10.1007/s13253-019-00367-1 In this paper, we extend and analyze a Bayesian hierarchical spatiotemporal model for physical systems. A novelty is to model the discrepancy between the output of a computer simulator for a physical process and the actual process values with a multivariate random walk. For computational efficiency, linear algebra for bandwidth-limited matrices is utilized, and first-order emulator inference allows for the fast emulation of a numerical partial differential equation (PDE) solver. A test scenario from a physical system motivated by glaciology is used to examine the speed and accuracy of the computational methods used, in addition to the viability of modeling assumptions. We conclude by discussing how the model and associated methodology can be applied in other physical contexts besides glaciology. Icelandic Centre for Research (152457). Peer reviewed

    A statistical model for estimation of fish density including correlation in size, space, time and between species from research survey data

    Trawl survey data with high spatial and seasonal coverage were analysed using a variant of the Log Gaussian Cox Process (LGCP) statistical model to estimate unbiased relative fish densities. The model estimates correlations between observations according to time, space, and fish size and includes zero observations and over-dispersion. The model utilises the fact that the correlation between numbers of fish caught increases when the distance in space and time between the fish decreases, and the correlation between size groups in a haul increases when the difference in size decreases. Here the model is extended in two ways. Instead of assuming a natural scale size correlation, the model is further developed to allow for a transformed length scale. Furthermore, in the present application, the spatial- and size-dependent correlation between species was included. For cod (Gadus morhua) and whiting (Merlangius merlangus), a common structured size correlation was fitted, and a separable structure between the time and space-size correlation was found for each species, whereas more complex structures were required to describe the correlation between species (and space-size). The within-species time correlation is strong, whereas the correlations between the species are weaker over time but strong within the year.

    GWAS of thyroid stimulating hormone highlights pleiotropic effects and inverse association with thyroid cancer

    Thyroid stimulating hormone (TSH) is critical for normal development and metabolism. To better understand the genetic contribution to TSH levels, we conduct a GWAS meta-analysis at 22.4 million genetic markers in up to 119,715 individuals and identify 74 genome-wide significant loci for TSH, of which 28 are previously unreported. Functional experiments show that the thyroglobulin protein-altering variants P118L and G67S impact thyroglobulin secretion. Phenome-wide association analysis in the UK Biobank demonstrates the pleiotropic effects of TSH-associated variants, and a polygenic score for higher TSH levels is associated with a reduced risk of thyroid cancer in the UK Biobank and three other independent studies. Two-sample Mendelian randomization using TSH index variants as instrumental variables suggests a protective effect of higher TSH levels (indicating lower thyroid function) on risk of thyroid cancer and goiter. Our findings highlight the pleiotropic effects of TSH-associated variants on thyroid function and growth of malignant and benign thyroid tumors.

    Incidence of cancer among commercial airline pilots

    OBJECTIVES: To describe the cancer pattern in a cohort of commercial pilots by follow-up through the Icelandic Cancer Registry.
METHODS: This is a retrospective cohort study of 458 pilots with emphasis on a subcohort working for an airline operating on international routes. A computerised file of the cohort was record linked to the Cancer Registry by making use of personal identification numbers. Expected numbers of cancer cases were calculated on the basis of number of person-years and incidences of cancer at specific sites for men provided by the Cancer Registry. A number of separate analyses were made according to different exposure variables.
RESULTS: The standardised incidence ratio (SIR) for all cancers was 0.97 (95% confidence interval (95% CI) 0.62 to 1.46) in the total cohort and 1.16 (95% CI 0.70 to 1.81) among those operating on international routes. The SIR for malignant melanoma of the skin was 10.20 (95% CI 3.29 to 23.81) in the total cohort and 15.63 (95% CI 5.04 to 36.46) in the restricted cohort. Analyses according to number of block-hours and radiation dose showed that malignant melanomas were found in the subgroups with the highest exposure estimates; the SIRs were 13.04 and 28.57, respectively. The SIR was 25.00 for malignant melanoma among those who had been flying over five time zones.
CONCLUSIONS: The study shows a high occurrence of malignant melanoma among pilots. It is open to discussion what role exposure to cosmic radiation, numbers of block-hours flown, or lifestyle factors, such as possible excessive sunbathing, play in the aetiology of cancer among pilots. This calls for further and more powerful studies. The excess of malignant melanoma among those flying over five time zones suggests that the importance of disturbance of the circadian rhythm should be taken into consideration in future studies.


Keywords: cancer registry; malignant melanoma of the skin; cosmic radiation; block-hours; time zone
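
The standardised incidence ratio used in the abstract above is the observed number of cases divided by the expected number. The sketch below shows the arithmetic together with an exact Poisson confidence interval. It assumes the observed count is Poisson distributed and the expected count is fixed; the abstract does not state which interval method the study used, and the counts in the usage line are hypothetical, not taken from the study.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def _bisect(f, lo, hi, iters=200):
    """Root of a continuous f that changes sign on [lo, hi]."""
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sir_with_exact_ci(observed, expected, alpha=0.05):
    """Return (SIR, lower, upper) using exact Poisson limits for the
    observed count, each divided by the expected count."""
    sir = observed / expected
    hi = observed + 10.0 * math.sqrt(observed + 1.0) + 10.0  # generous bracket
    if observed == 0:
        lower = 0.0
    else:
        # Smallest mean for which P(X >= observed) reaches alpha / 2.
        lower = _bisect(
            lambda mu: (1.0 - poisson_cdf(observed - 1, mu)) - alpha / 2, 0.0, hi
        )
    # Largest mean for which P(X <= observed) is still alpha / 2.
    upper = _bisect(lambda mu: poisson_cdf(observed, mu) - alpha / 2, 0.0, hi)
    return sir, lower / expected, upper / expected


# Hypothetical counts: 5 observed cases against 0.5 expected.
sir, lower, upper = sir_with_exact_ci(5, 0.5)
```

With small observed counts the interval is wide and asymmetric, which is why a large SIR can still come with a lower limit far below the point estimate.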