69 research outputs found

    A new skew-elliptical distribution and its properties

    This article generalizes a multivariate skew-elliptical distribution and describes several of its properties. The univariate version of the new distribution is compared with two other currently used distributions. The use of the new distribution is illustrated with a real data example suitable for regression modelling. The new model provides a better fit than its two rivals, as evaluated by suitable Bayesian model selection criteria.

    Spatio-Temporal Modelling and Forecasting of Fine Particulate Matter

    Studies indicate that even short-term exposure to high concentrations of fine atmospheric particulate matter (PM2.5) can lead to long-term health effects. Data are typically observed at fixed monitoring stations throughout a study region of interest at different time points. The study region may contain both rural and urban areas. Statistical spatio-temporal models are appropriate for modelling these data. In this talk I will summarise my recent work on modelling and short-term forecasting of PM2.5 levels. I will talk about a random effects model developed in Sahu et al. (2004) and briefly mention a Bayesian Kriged-Kalman filtering model detailed in Sahu and Mardia (2005). In the first approach we introduce two random effects components, one for rural or background levels and the other as a supplement for urban areas. These are specified in the form of spatio-temporal processes. Weighting these processes through population density results in nonstationarity in space. In the talk I will analyze a dataset on observed PM2.5 in three states in the U.S.: Illinois, Indiana and Ohio.

    A rigorous statistical framework for spatio-temporal pollution prediction and estimation of its long-term impact on health

    In the United Kingdom, air pollution is linked to around 40,000 premature deaths each year, but estimating its health effects is challenging in a spatio-temporal study. The challenges include spatial misalignment between the pollution and disease data; uncertainty in the estimated pollution surface; and complex residual spatio-temporal autocorrelation in the disease data. This article develops a two-stage model that addresses these issues. The first stage is a spatio-temporal fusion model linking modeled and measured pollution data, while the second stage links these predictions to the disease data. The methodology is motivated by a new five-year study investigating the effects of multiple pollutants on respiratory hospitalizations in England between 2007 and 2011, using pollution and disease data relating to local and unitary authorities on a monthly time scale.

    Recent Trends in Modelling Spatio-Temporal Data

    This paper reviews the most recent methodologies proposed in the field of spatio-temporal models. To offer a unified view of the methodologies covered, it first describes the various types of spatio-temporal data. It then discusses models for spatially continuous processes. Spatio-temporal modelling has been widely used to address problems in environmental science, geostatistics, hydrology and meteorology. This article provides an analysis of the methods currently applied in many of these areas.

    A Bayesian localized conditional autoregressive model for estimating the health effects of air pollution

    Estimation of the long-term health effects of air pollution is a challenging task, especially when modeling spatial small-area disease incidence data in an ecological study design. The challenge comes from the unobserved underlying spatial autocorrelation structure in these data, which is accounted for using random effects modeled by a globally smooth conditional autoregressive model. These smooth random effects confound the effects of air pollution, which are also globally smooth. To avoid this collinearity, a Bayesian localized conditional autoregressive model is developed for the random effects. This localized model is spatially flexible, in the sense that it can model not only areas of spatial smoothness but also step changes in the random effects surface. This methodological development allows us to improve the estimation performance of the covariate effects, compared to using traditional conditional autoregressive models. These results are established using a simulation study, and are then illustrated with our motivating study on air pollution and respiratory ill health in Greater Glasgow, Scotland in 2011. The model shows substantial health effects of particulate matter air pollution and nitrogen dioxide, whose effects have been consistently attenuated by the currently available globally smooth models.

    A Bayesian Perspective of Statistical Machine Learning for Big Data

    Statistical Machine Learning (SML) refers to a body of algorithms and methods by which computers are allowed to discover important features of input data sets, which are often very large in size. The very task of feature discovery from data is essentially the meaning of the keyword 'learning' in SML. Theoretical justifications for the effectiveness of the SML algorithms are underpinned by sound principles from different disciplines, such as Computer Science and Statistics. The theoretical underpinnings particularly justified by statistical inference methods are together termed statistical learning theory. This paper provides a review of SML from a Bayesian decision theoretic point of view, where we argue that many SML techniques are closely connected to making inference by using the so-called Bayesian paradigm. We discuss many important SML techniques such as supervised and unsupervised learning, deep learning, online learning and Gaussian processes, especially in the context of very large data sets where these are often employed. We present a dictionary which maps the key concepts of SML from Computer Science and Statistics. We illustrate the SML techniques with three moderately large data sets where we also discuss many practical implementation issues. Thus the review is especially targeted at statisticians and computer scientists who are aspiring to understand and apply SML for moderately large to big data sets. Comment: 26 pages, 3 figures, Review pape

    Regional surface chlorophyll trends and uncertainties in the global ocean

    Changes in marine primary productivity are key to determining how climate change might impact marine ecosystems and fisheries. Satellite ocean color sensors provide coverage of global ocean chlorophyll with a combined record length of ~20 years. Coupled physical–biogeochemical models can inform on expected changes and are used here to constrain observational trend estimates and their uncertainty. We produce estimates of ocean surface chlorophyll trends by using Coupled Model Intercomparison Project (CMIP5) models to form priors as a “first guess”, which are then updated using satellite observations in a Bayesian spatio-temporal model. Regional chlorophyll trends are found to be significantly different from zero in 18/23 regions, in the range ±1.8% year⁻¹. A global average of these regional trends shows a net positive trend of 0.08 ± 0.35% year⁻¹, highlighting the importance of considering chlorophyll changes at a regional level. We compare these results with estimates obtained with the commonly used “vague” prior, representing no independent knowledge; coupled model priors are shown to slightly reduce trend magnitude and uncertainties in most regions. The statistical model used here provides a robust framework for making best use of all available information and can be applied to improve understanding of global change.

    Bayesian hierarchical modelling approaches for combining information from multiple data sources to produce annual estimates of national immunization coverage

    Estimates of national immunization coverage are crucial for guiding policy and decision-making in national immunization programs and setting the global immunization agenda. WHO and UNICEF estimates of national immunization coverage (WUENIC) are produced annually for various vaccine-dose combinations and all WHO Member States using information from multiple data sources and a deterministic computational logic approach. This approach, however, is incapable of characterizing the uncertainties inherent in coverage measurement and estimation. It also provides no statistically principled way of exploiting and accounting for the interdependence in immunization coverage data collected for multiple vaccines, countries and time points. Here, we develop Bayesian hierarchical modeling approaches for producing accurate estimates of national immunization coverage and their associated uncertainties. We propose and explore two candidate models: a balanced data single likelihood (BDSL) model and an irregular data multiple likelihood (IDML) model, which differ in their handling of missing data and characterization of the uncertainties associated with the multiple input data sources. We provide a simulation study that demonstrates a high degree of accuracy of the estimates produced by the proposed models, and which also shows that the IDML model is the better model. We apply the methodology to produce coverage estimates for select vaccine-dose combinations for the period 2000-2019. A contributed R package imcover implementing the No-U-Turn Sampler (NUTS) in the Stan programming language enhances the utility and reproducibility of the methodology. Comment: 31 pages (main), 4 figure

    Assessing trends and uncertainties in satellite-era ocean chlorophyll using space-time modeling

    The presence, magnitude, and even direction of long-term trends in phytoplankton abundance over the past few decades are still debated in the literature, primarily due to differences in the data sets and methodologies used. Recent work has suggested that the satellite chlorophyll record is not yet long enough to distinguish climate change trends from natural variability, despite the high density of coverage in both space and time. Previous work has typically focused on using linear models to determine the presence of trends, where each grid cell is considered independently from its neighbors. However, trends can be more thoroughly evaluated using a spatially resolved approach. Here a Bayesian hierarchical spatio-temporal model is fitted to quantify trends in ocean chlorophyll from September 1997 to December 2013. The approach used in this study explicitly accounts for the dependence between neighboring grid cells, which allows us to estimate trend by ‘borrowing strength’ from the spatial correlation. By way of comparison, a model without spatial correlation is also fitted. This results in a notable loss of accuracy in model fit. Additionally, we find an order of magnitude smaller global trend, and larger uncertainty, when using the spatio-temporal model: -0.023 ± 0.12% yr⁻¹, as opposed to -0.38 ± 0.045% yr⁻¹ when the spatial correlation is not taken into account. The improvement in accuracy of trend estimates, and the more complete account of their uncertainty, emphasize the solution that space-time modeling offers for studying global long-term change.
