
    A rigorous statistical framework for spatio-temporal pollution prediction and estimation of its long-term impact on health

    In the United Kingdom, air pollution is linked to around 40,000 premature deaths each year, but estimating its health effects is challenging in a spatio-temporal study. The challenges include spatial misalignment between the pollution and disease data; uncertainty in the estimated pollution surface; and complex residual spatio-temporal autocorrelation in the disease data. This article develops a two-stage model that addresses these issues. The first stage is a spatio-temporal fusion model linking modeled and measured pollution data, while the second stage links these predictions to the disease data. The methodology is motivated by a new five-year study investigating the effects of multiple pollutants on respiratory hospitalizations in England between 2007 and 2011, using pollution and disease data relating to local and unitary authorities on a monthly time scale.

    Multivariate space-time modelling of multiple air pollutants and their health effects accounting for exposure uncertainty

    The long-term health effects of air pollution are often estimated using a spatio-temporal ecological areal unit study, but this design leads to the following statistical challenges: (1) how to estimate spatially representative pollution concentrations for each areal unit; (2) how to allow for the uncertainty in these estimated concentrations when estimating their health effects; and (3) how to simultaneously estimate the joint effects of multiple correlated pollutants. This article proposes a novel two-stage Bayesian hierarchical model for addressing these three challenges, with inference based on Markov chain Monte Carlo simulation. The first stage is a multivariate spatio-temporal fusion model for predicting areal level average concentrations of multiple pollutants from both monitored and modelled pollution data. The second stage is a spatio-temporal model for estimating the health impact of multiple correlated pollutants simultaneously, which accounts for the uncertainty in the estimated pollution concentrations. The novel methodology is motivated by a new study of the impact of both particulate matter and nitrogen dioxide concentrations on respiratory hospital admissions in Scotland between 2007 and 2011, and the results suggest that both pollutants exhibit substantial and independent health effects.
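The two-stage idea above can be illustrated with a minimal numpy sketch on synthetic data. This is not the paper's Bayesian hierarchical MCMC model: stage 1 is reduced to a linear calibration of modelled against monitored concentrations that retains predictive uncertainty, and stage 2 propagates that uncertainty into a crude log-rate health regression by Monte Carlo. All data, coefficients, and the 0.01 exposure effect are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Synthetic data (hypothetical): 50 areal units with modelled and monitored pollution ---
n = 50
modelled = rng.uniform(10, 40, n)                   # dispersion-model output per area
true_conc = 2.0 + 0.9 * modelled + rng.normal(0, 2, n)
monitored = true_conc + rng.normal(0, 1, n)         # noisy monitor measurements

# Stage 1: fusion by linear calibration of modelled to monitored data,
# keeping the predictive uncertainty rather than a single point estimate.
X = np.column_stack([np.ones(n), modelled])
beta, *_ = np.linalg.lstsq(X, monitored, rcond=None)
resid = monitored - X @ beta
sigma2 = resid @ resid / (n - 2)
cov = sigma2 * np.linalg.inv(X.T @ X)
pred_mean = X @ beta
pred_var = sigma2 + np.einsum("ij,jk,ik->i", X, cov, X)  # predictive variance per area

# Stage 2: propagate exposure uncertainty into the health model by Monte Carlo.
# Disease counts are simulated with a log-linear exposure effect of 0.01.
pop = rng.integers(5_000, 20_000, n)
counts = rng.poisson(pop * np.exp(-6 + 0.01 * true_conc))
effects = []
for _ in range(200):
    exposure = rng.normal(pred_mean, np.sqrt(pred_var))  # draw one exposure surface
    Z = np.column_stack([np.ones(n), exposure])
    y = np.log(counts / pop)                             # crude log-rate approximation
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    effects.append(b[1])

effects = np.array(effects)
print(f"exposure effect: {effects.mean():.4f} +/- {effects.std():.4f}")
```

The spread of `effects` across draws reflects how exposure uncertainty from stage 1 widens the health-effect estimate, which is the point the two-stage design addresses.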

    An integrated Bayesian model for estimating the long-term health effects of air pollution by fusing modelled and measured pollution data: a case study of nitrogen dioxide concentrations in Scotland

    The long-term health effects of air pollution can be estimated using a spatio-temporal ecological study, where the disease data are counts of hospital admissions from populations in small areal units at yearly intervals. Spatially representative pollution concentrations for each areal unit are typically estimated by applying Kriging to data from a sparse monitoring network, or by computing averages over grid level concentrations from an atmospheric dispersion model. We propose a novel fusion model for estimating spatially aggregated pollution concentrations using both the modelled and monitored data, and relate these concentrations to respiratory disease in a new study in Scotland between 2007 and 2011.

    First CLADAG data mining prize : data mining for longitudinal data with different marketing campaigns

    The CLAssification and Data Analysis Group (CLADAG) of the Italian Statistical Society recently organised a competition, the 'Young Researcher Data Mining Prize', sponsored by the SAS Institute. This paper was the winning entry, and in it we detail our approach to the problem proposed and our results. The main methods used are linear regression, mixture models, and Bayesian autoregressive and dynamic models.

    A Tutorial on Estimating Time-Varying Vector Autoregressive Models

    Time series of individual subjects have become a common data type in psychological research. These data allow one to estimate models of within-subject dynamics, and thereby avoid the notorious problem of making within-subjects inferences from between-subjects data, and naturally address heterogeneity between subjects. A popular model for these data is the Vector Autoregressive (VAR) model, in which each variable is predicted as a linear function of all variables at previous time points. A key assumption of this model is that its parameters are constant (or stationary) across time. However, in many areas of psychological research time-varying parameters are plausible or even the subject of study. In this tutorial paper, we introduce methods to estimate time-varying VAR models based on splines and kernel smoothing, with and without regularization. We use simulations to evaluate the relative performance of all methods in scenarios typical in applied research, and discuss their strengths and weaknesses. Finally, we provide a step-by-step tutorial showing how to apply the discussed methods to an openly available time series of mood-related measurements.
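The kernel-smoothing approach to a time-varying VAR can be sketched in a few lines of numpy: fit the VAR(1) matrix at each focal time point by least squares with Gaussian kernel weights centred on that point. The simulated bivariate process, the bandwidth, and the drifting cross-lagged coefficient are all illustrative choices, not the tutorial's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a bivariate VAR(1) whose cross-lagged coefficient drifts over time
T = 400
y = np.zeros((T, 2))
for t in range(1, T):
    a12 = 0.5 * t / T                                   # time-varying coefficient: 0 -> 0.5
    A = np.array([[0.3, a12], [0.0, 0.3]])
    y[t] = A @ y[t - 1] + rng.normal(0, 1, 2)

def tv_var_kernel(y, t0, bandwidth):
    """Kernel-weighted least squares estimate of the VAR(1) matrix at time t0."""
    X, Y = y[:-1], y[1:]
    times = np.arange(1, len(y))
    w = np.exp(-0.5 * ((times - t0) / bandwidth) ** 2)  # Gaussian kernel weights
    W = np.sqrt(w)[:, None]
    # Weighted least squares: Y ~ X A^T, so lstsq returns A^T column-wise
    A_hat, *_ = np.linalg.lstsq(W * X, W * Y, rcond=None)
    return A_hat.T

early = tv_var_kernel(y, t0=50, bandwidth=40)
late = tv_var_kernel(y, t0=350, bandwidth=40)
print("a12 early:", round(early[0, 1], 2), "late:", round(late[0, 1], 2))
```

A smaller bandwidth tracks faster parameter changes at the cost of noisier estimates, which is the bias-variance trade-off the tutorial's simulations examine.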

    A Functional Wavelet-Kernel Approach for Continuous-time Prediction

    We consider the problem of predicting a continuous-time stochastic process over an entire time interval from its recent past. The approach we adopt is based on functional kernel nonparametric regression estimation techniques, where observations are segments of the observed process, considered as curves. These curves are assumed to lie within a space of possibly inhomogeneous functions, and the discretized time series dataset consists of a number of measurements made at regular times that is relatively small compared with the number of segments. We thus consider only the case where an asymptotically non-increasing number of measurements is available for each portion of the time series. We estimate conditional expectations using appropriate wavelet decompositions of the segmented sample paths. A notion of similarity, based on these wavelet decompositions, is used to calibrate the prediction. Asymptotic properties as the number of segments grows to infinity are investigated under mild conditions, and a nonparametric resampling procedure is used to generate, in a flexible way, valid asymptotic pointwise confidence intervals for the predicted trajectories. We illustrate the usefulness of the proposed functional wavelet-kernel methodology in finite-sample situations by means of three real-life datasets collected from different arenas.
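The core prediction step can be sketched with synthetic data: forecast the next segment as a similarity-weighted average of the segments that followed similar past segments. For simplicity this sketch measures similarity with a plain L2 distance between discretized curves rather than the paper's wavelet-coefficient similarity; the periodic signal and bandwidth are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical periodic signal observed as daily segments of p points each
p, n_days = 24, 60
t = np.arange(p * n_days)
series = np.sin(2 * np.pi * t / p) + 0.1 * rng.normal(size=t.size)
segments = series.reshape(n_days, p)               # one row per "day" (curve)

def predict_next_segment(segments, bandwidth=1.0):
    """Functional kernel regression: predict the next curve as a similarity-
    weighted average of the segments that followed similar past segments.
    (L2 distance stands in for the wavelet-based similarity of the paper.)"""
    past, future = segments[:-1], segments[1:]
    last = segments[-1]
    d = np.linalg.norm(past - last, axis=1)        # distance of each past segment to the latest
    w = np.exp(-0.5 * (d / bandwidth) ** 2)        # Gaussian kernel weights
    w /= w.sum()
    return w @ future                              # Nadaraya-Watson average of successor curves

forecast = predict_next_segment(segments)
truth = np.sin(2 * np.pi * np.arange(p) / p)       # noise-free next-day shape
print("RMSE vs noise-free curve:", round(float(np.sqrt(np.mean((forecast - truth) ** 2))), 3))
```

Because all past segments here resemble the latest one, the kernel average pools many successors and the noise averages out; with inhomogeneous curves, the wavelet-based similarity would concentrate the weights on genuinely comparable segments.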

    Intraday forecasts of a volatility index: Functional time series methods with dynamic updating

    As a forward-looking measure of future equity market volatility, the VIX index has gained immense popularity in recent years to become a key measure of risk for market analysts and academics. We consider discrete reported intraday VIX tick values as realisations of a collection of curves observed sequentially on equally spaced and dense grids over time, and utilise functional data analysis techniques to produce one-day-ahead forecasts of these curves. The proposed method facilitates the investigation of dynamic changes in the index over very short time intervals, as showcased using the 15-second high-frequency VIX index values. With the help of dynamic updating techniques, our point and interval forecasts are shown to enjoy improved accuracy over conventional time series models. (29 pages, 5 figures; to appear in the Annals of Operations Research.)
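A common functional time series recipe of the kind described above is: decompose the daily curves by functional principal components, forecast the scores with a scalar time series model, and rebuild the next curve. The numpy sketch below does this on a simulated stand-in for intraday curves with a single AR(1)-driven level factor; the curve shape, sample sizes, and one-component truncation are illustrative assumptions, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical intraday "volatility" curves: a common shape whose level
# follows an AR(1) process from day to day
p, n_days = 48, 200
grid = np.linspace(0, 1, p)
shape = 1 + 0.5 * np.sin(np.pi * grid)               # smile-like intraday pattern
level = np.zeros(n_days)
for d in range(1, n_days):
    level[d] = 0.8 * level[d - 1] + rng.normal(0, 0.2)
curves = (1 + level)[:, None] * shape + 0.02 * rng.normal(size=(n_days, p))

# Functional PCA: centre the curves and extract the leading component via SVD
mean_curve = curves.mean(axis=0)
U, s, Vt = np.linalg.svd(curves - mean_curve, full_matrices=False)
scores = U[:, 0] * s[0]                              # day-by-day score on PC1
pc1 = Vt[0]

# Forecast tomorrow's score with an AR(1) fit, then rebuild the curve
phi = (scores[:-1] @ scores[1:]) / (scores[:-1] @ scores[:-1])
next_score = phi * scores[-1]
forecast_curve = mean_curve + next_score * pc1
print("forecast level range:", round(float(forecast_curve.min()), 2),
      "-", round(float(forecast_curve.max()), 2))
```

Dynamic updating would go one step further: as the first part of tomorrow's curve arrives intraday, the score forecast is revised using the partially observed curve rather than waiting for the full day.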