226 research outputs found

Reduced-rank spatio-temporal modeling of air pollution concentrations in the Multi-Ethnic Study of Atherosclerosis and Air Pollution

Author: Kaufman Joel D.
Lindström Johan
Olives Casey
Sampson Paul D.
Sheppard Lianne
Szpiro Adam A.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2014

There is growing evidence in the epidemiologic literature of the relationship between air pollution and adverse health outcomes. Prediction of individual air pollution exposure in the Environmental Protection Agency (EPA) funded Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air) study relies on a flexible spatio-temporal prediction model that integrates land-use regression with kriging to account for spatial dependence in pollutant concentrations. Temporal variability is captured using temporal trends estimated via modified singular value decomposition and temporally varying spatial residuals. This model utilizes monitoring data from existing regulatory networks and supplementary MESA Air monitoring data to predict concentrations for individual cohort members. In general, spatio-temporal models are limited in their efficacy for large data sets due to computational intractability. We develop reduced-rank versions of the MESA Air spatio-temporal model. To do so, we apply low-rank kriging to account for spatial variation in the mean process and discuss the limitations of this approach. As an alternative, we represent spatial variation using thin plate regression splines (TPRS). We compare the performance of the outlined models using EPA and MESA Air monitoring data for predicting concentrations of oxides of nitrogen (NOx), a pollutant of primary interest in MESA Air, in the Los Angeles metropolitan area via cross-validated R2. Our findings suggest that use of reduced-rank models can improve computational efficiency in certain cases. Low-rank kriging and thin plate regression splines were competitive across the formulations considered, although TPRS appeared to be more robust in some settings. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/14-AOAS786 by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Lund University Publications

PubMed Central

Measurement error in a multi-level analysis of air pollution and health: a simulation study.

Author: A Burton
A Gryparis
AA Szpiro
B Armstrong
Barbara K. Butland
Benjamin Barratt
BK Butland
D Bates
DR Cox
Evangelia Samoli
GK Reeves
GT Goldman
IC Mills
KL Dionisio
Klea Katsouyanni
MJ Strickland
P Armitage
Richard W. Atkinson
S-Y Kim
SE Alexeeff
WN Venables
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/02/2019

BACKGROUND: Spatio-temporal models are increasingly being used to predict exposure to ambient outdoor air pollution at high spatial resolution for inclusion in epidemiological analyses of air pollution and health. Measurement error in these predictions can nevertheless have impacts on health effect estimation. Using statistical simulation we aim to investigate the effects of such error within a multi-level model analysis of long and short-term pollutant exposure and health. METHODS: Our study was based on a theoretical sample of 1000 geographical sites within Greater London. Simulations of "true" site-specific daily mean and 5-year mean NO2 and PM10 concentrations, incorporating both temporal variation and spatial covariance, were informed by an analysis of daily measurements over the period 2009-2013 from fixed location urban background monitors in the London area. In the context of a multi-level single-pollutant Poisson regression analysis of mortality, we investigated scenarios in which we specified: the Pearson correlation between modelled and "true" data and the ratio of their variances (model versus "true") and assumed these parameters were the same spatially and temporally. RESULTS: In general, health effect estimates associated with both long and short-term exposure were biased towards the null with the level of bias increasing to over 60% as the correlation coefficient decreased from 0.9 to 0.5 and the variance ratio increased from 0.5 to 2. However, for a combination of high correlation (0.9) and small variance ratio (0.5) non-trivial bias (> 25%) away from the null was observed. Standard errors of health effect estimates, though unaffected by changes in the correlation coefficient, appeared to be attenuated for variance ratios > 1 but inflated for variance ratios < 1. 
CONCLUSION: While our findings suggest that in most cases modelling errors result in attenuation of the effect estimate towards the null, in some situations a non-trivial bias away from the null may occur. The magnitude and direction of bias appear to depend on the relationship between modelled and "true" data in terms of their correlation and the ratio of their variances. These factors should be taken into account when assessing the validity of modelled air pollution predictions for use in complex epidemiological models.
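
The dependence of bias on the correlation and the variance ratio described in this abstract can be illustrated with a much simpler setting than the one the study uses. The following is a minimal sketch, assuming a single-level linear regression (not the multi-level Poisson model of the paper), standard normal "true" exposure, and hypothetical function names. In that simplified setting the estimated slope converges to the true slope times rho / sqrt(variance ratio), which gives about 1.27 for (0.9, 0.5) and about 0.35 for (0.5, 2), in line with the direction and rough size of the biases reported above.

```python
import numpy as np


def expected_slope_ratio(rho, var_ratio):
    """Large-sample ratio of estimated to true slope in a simple linear model.

    rho       : Pearson correlation between modelled and "true" exposure
    var_ratio : variance of modelled exposure divided by variance of "true" exposure
    """
    return rho / np.sqrt(var_ratio)


def simulate_slope_ratio(rho, var_ratio, beta=1.0, n=200_000, seed=0):
    """Monte Carlo check of expected_slope_ratio (illustrative assumption:
    linear outcome model with unit-variance Gaussian noise)."""
    rng = np.random.default_rng(seed)
    x_true = rng.standard_normal(n)
    noise = rng.standard_normal(n)
    # Modelled exposure with the requested correlation and variance ratio.
    x_model = np.sqrt(var_ratio) * (rho * x_true + np.sqrt(1.0 - rho**2) * noise)
    # The outcome depends on the true exposure only.
    y = beta * x_true + rng.standard_normal(n)
    # Ordinary least-squares slope of y on the modelled exposure.
    slope = np.cov(x_model, y)[0, 1] / np.var(x_model, ddof=1)
    return slope / beta


if __name__ == "__main__":
    for rho, vr in [(0.9, 0.5), (0.9, 1.0), (0.5, 2.0)]:
        print(rho, vr, expected_slope_ratio(rho, vr), simulate_slope_ratio(rho, vr))
```

A ratio above 1 corresponds to bias away from the null and a ratio below 1 to attenuation towards the null.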

Crossref

Directory of Open Access Journals

King's Research Portal

St George's Online Research Archive

Pragmatic Estimation of a Spatio-Temporal Air Quality Model With Irregular Monitoring Data

Author: Kaufman Joel D
Lindström Johan
Sampson Paul D
Sheppard Lianne
Szpiro Adam A
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/01/2009

Statistical analyses of the health effects of air pollution have increasingly used GIS-based covariates for prediction of ambient air quality in “land-use” regression models. More recently these regression models have accounted for spatial correlation structure in combining monitoring data with land-use covariates. The current paper builds on these concepts to address spatio-temporal prediction of ambient concentrations of particulate matter with aerodynamic diameter less than 2.5 μm (PM2.5) on the basis of a model representing spatially varying seasonal trends and spatial correlation structures. Our hierarchical methodology provides a pragmatic approach that fully exploits regulatory and other supplemental monitoring data which jointly define a complex spatio-temporal monitoring design. We explain the elements of the computational approach, including estimation of smoothed empirical orthogonal functions (SEOFs) as basis functions for temporal trend, spatial (“land use”) regression by Partial Least Squares (PLS), modeling of spatio-temporal correlation structure, and generalized universal kriging prediction of ambient exposure for subjects in the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air) project. Analyses are demonstrated in detail for the South California study area of the MESA Air project using AQS monitoring data from 2000 to 2006 and supplemental MESA Air monitoring data beginning in 2005. Results of application of the modeling and estimation methodology are presented also for five other MESA Air metropolitan study areas across the country with comments on current and future research developments

Lund University Publications

Collection Of Biostatistics Research Archive

Predicting Intra-Urban Variation in Air Pollution Concentrations with Complex Spatio-Temporal Interactions

Author: Adar Sara D
Kaufman Joel
Lumley Thomas
Sampson Paul D
Sheppard Lianne
Szpiro Adam A
Publication venue: Collection of Biostatistics Research Archive
Publication date: 20/11/2008

We describe a methodology for assigning individual estimates of long-term average air pollution concentrations that accounts for a complex spatio-temporal correlation structure and can accommodate unbalanced observations. This methodology has been developed as part of the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air), a prospective cohort study funded by the U.S. EPA to investigate the relationship between chronic exposure to air pollution and cardiovascular disease. Our hierarchical model decomposes the space-time field into a “mean” that includes dependence on covariates and spatially varying seasonal and long-term trends and a “residual” that accounts for spatially correlated deviations from the mean model. The model accommodates complex spatio-temporal patterns by characterizing the temporal trend at each location as a linear combination of empirically derived temporal basis functions, and embedding the spatial fields of coefficients for the basis functions in separate linear regression models with spatially correlated residuals (universal kriging). This approach allows us to implement a scalable single-stage estimation procedure that easily accommodates a significant number of missing observations at some monitoring locations. We apply the model to predict long-term average concentrations of oxides of nitrogen (NOx) from 2005-2007 in the Los Angeles area, based on data from 18 EPA Air Quality System regulatory monitors. The cross-validated R2 is 0.67. The MESA Air study is also collecting additional concentration data as part of a supplementary monitoring campaign. We describe the sampling plan and demonstrate in a simulation study that the additional data will contribute to improved predictions of long-term average concentrations
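
The mean-plus-residual decomposition described in this abstract can be written out as follows. This is a sketch only: the symbols (y, mu, nu, f_i, beta_i, X_i, alpha_i, epsilon_i, m) are notation assumed here for illustration and are not taken from the abstract.

```latex
\begin{align}
  y(s,t) &= \mu(s,t) + \nu(s,t)
    && \text{observation at location } s \text{ and time } t \\
  \mu(s,t) &= \sum_{i=1}^{m} \beta_i(s)\, f_i(t)
    && \text{trend as a linear combination of temporal basis functions } f_i \\
  \beta_i(s) &= X_i(s)\,\alpha_i + \varepsilon_i(s)
    && \text{universal kriging: covariate regression plus spatially correlated } \varepsilon_i
\end{align}
```

Here nu(s,t) stands for the "residual" field of spatially correlated deviations from the mean model, and X_i(s) for the covariates at location s.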

Collection Of Biostatistics Research Archive

A Flexible Spatio-Temporal Model for Air Pollution: Allowing for Spatio-Temporal Covariates

Author: Larson Tim
Lindstrom Johan
Oron Assaf
Richards Mark
Sampson Paul D
Sheppard Lianne
Szpiro Adam A
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/01/2011

Given the increasing interest in the association between exposure to air pollution and adverse health outcomes, the development of models that provide accurate spatio-temporal predictions of air pollution concentrations at small spatial scales is of great importance when assessing potential health effects of air pollution. The methodology presented here has been developed as part of the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air), a prospective cohort study funded by the US EPA to investigate the relationship between chronic exposure to air pollution and cardiovascular disease. We present a spatio-temporal framework that models and predicts ambient air pollution by combining data from several different monitoring networks with the output from deterministic air pollution model(s). The model can accommodate arbitrarily missing observations and allows for a complex spatio-temporal correlation structure. We apply the model to predict long-term average concentrations of gaseous oxides of nitrogen (NOx), one of the primary pollutants of interest in the MESA Air study, during a ten-year period in the Los Angeles area, based on measurements from the EPA Air Quality System and MESA Air monitoring. The measurements are augmented by a spatio-temporal covariate based on the output from a source dispersion model for traffic-related air pollution (Caline3QHC), and the model is evaluated using cross-validation. The predictive ability of the model is good, with a cross-validated R2 of approximately 0.7 at subject sites. The incorporation of a dispersion model output into the overall prediction model was feasible, but the particular implementation of Caline3QHC used here did not improve predictions in a model that also includes road information. However, when the road information is excluded, the inclusion of model output improves predictions, and we find some evidence that the source dispersion model can replace road covariates.
The model presented in this paper has been implemented in an R package, SpatioTemporal, which will be available on CRAN shortly

Lund University Publications

Collection Of Biostatistics Research Archive

Concentrations of criteria pollutants in the contiguous U.S., 1979 – 2015: Role of model parsimony in integrated empirical geographic regression

Author: Bechle Matthew
Hankey Steve
Kim Sun-Young
Marshall Julian D
Sheppard Elizabeth (Lianne) A
Szpiro Adam A
Publication venue: Collection of Biostatistics Research Archive
Publication date: 30/11/2018

BACKGROUND: National- or regional-scale prediction models that estimate individual-level air pollution concentrations commonly include hundreds of geographic variables. However, these many variables may not be necessary, and a parsimonious approach including a small number of variables may achieve sufficient predictive ability. This parsimonious approach can also be applied to most criteria pollutants. This approach will be powerful when generating publicly available datasets of model predictions that support research in environmental health and other fields. OBJECTIVES: We aim to (1) build annual-average integrated empirical geographic (IEG) regression models for the contiguous U.S. for six criteria pollutants, for all years with regulatory monitoring data during 1979 – 2015; (2) explore the impact of model parsimony on model performance by comparing performance across the number of variables offered to a model; and (3) provide publicly available model predictions. METHODS: We compute annual-average concentrations from regulatory monitoring data for PM10, PM2.5, NO2, SO2, CO, and ozone at all monitoring sites for 1979-2015. We also compute ~900 geographic characteristics at each location including measures of traffic, land use, and satellite-based estimates of air pollution and landcover. We then develop IEG models, employing universal kriging and summary factors estimated by partial least squares (PLS) of independent variables. For all pollutants and years, we compare three approaches for choosing variables to include in the model: (1) no variables (kriging only), (2) a limited number of variables chosen by forward selection, and (3) all variables. We evaluate model performance using 10-fold cross-validation (CV) using conventional randomly-selected and spatially-clustered test data.
RESULTS: Models using 3 to 30 variables generally have the best performance across all pollutants and years (median R2 conventional [clustered] CV: 0.66 [0.47]) compared to models with no (0.37 [0]) or all variables (0.64 [0.27]). Using the best models mostly including 3-30 variables, we predicted annual-average concentrations of six criteria pollutants for all Census Blocks in the contiguous U.S. DISCUSSION: Our findings suggest that national prediction models can be built on only a small number (30 or fewer) of important variables and provide robust concentration estimates. Model estimates are freely available online
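
The conventional 10-fold cross-validated R2 reported in this abstract could be computed along the following lines. This is a minimal sketch under stated assumptions: an ordinary least-squares fit stands in for the universal kriging and PLS model, folds are assigned at random (the "conventional" scheme, not the spatially clustered one), and R2 is taken as 1 - MSE / variance truncated at zero, which is one common convention and not necessarily the authors' definition. The function name `cv_r2` is hypothetical.

```python
import numpy as np


def cv_r2(X, y, n_folds=10, seed=0):
    """K-fold cross-validated R2 using an OLS stand-in model.

    X : (n_sites, n_variables) array of geographic covariates
    y : (n_sites,) array of observed concentrations
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)

    pred = np.empty(len(y), dtype=float)
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        # Fit intercept + slopes on the training sites only.
        A = np.column_stack([np.ones(train.sum()), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        # Predict the held-out sites.
        B = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        pred[test_idx] = B @ coef

    mse = np.mean((y - pred) ** 2)
    # Assumed convention: R2 is reported as 0 when predictions are worse
    # than the overall mean.
    return max(0.0, 1.0 - mse / np.var(y))
```

A spatially clustered variant would replace the random permutation with folds built from groups of neighbouring sites, so that each held-out site is far from its training data.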

Collection Of Biostatistics Research Archive

Exposure measurement error in PM2.5 health effects studies: A pooled analysis of eight personal exposure validation studies

Author: Hong Biling
Kaufman Joel D
Kioumourtzoglou Marianthi-Anna
Laden Francine
Sheppard Lianne
Spiegelman Donna
Suh Helen
Szpiro Adam A
Williams Ronald
Yanosky Jeff D
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/03/2014

Background: Exposure measurement error is a concern in long-term PM2.5 health studies using ambient concentrations as exposures. We assessed error magnitude by estimating calibration coefficients as the association between personal PM2.5 exposures from validation studies and typically available surrogate exposures. Methods: Daily personal and ambient PM2.5, and when available sulfate, measurements were compiled from nine cities, over 2 to 12 days. True exposure was defined as personal exposure to PM2.5 of ambient origin. Since PM2.5 of ambient origin could only be determined for five cities, personal exposure to total PM2.5 was also considered. Surrogate exposures were estimated as ambient PM2.5 at the nearest monitor or predicted outside subjects’ homes. We estimated calibration coefficients by regressing true on surrogate exposures in random effects models. Results: When monthly-averaged personal PM2.5 of ambient origin was used as the true exposure, calibration coefficients equaled 0.31 (95% CI:0.14, 0.47) for nearest monitor and 0.54 (95% CI:0.42, 0.65) for outdoor home predictions. Between-city heterogeneity was not found for outdoor home PM2.5 for either true exposure. Heterogeneity was significant for nearest monitor PM2.5, for both true exposures, but not after adjusting for city-average motor vehicle number for total personal PM2.5. Conclusions: Calibration coefficients were <1, consistent with previously reported chronic health risks using nearest monitor exposures being under-estimated when ambient concentrations are the exposure of interest. Calibration coefficients were closer to 1 for outdoor home predictions, likely reflecting less spatial error. Further research is needed to determine how our findings can be incorporated in future health studies

Harvard University - DASH

Forecasting confined spatiotemporal chaos with genetic algorithms

Author: Alberto Álvarez
B. I. Shraiman
B. J. Gluckman
Cristóbal López
Emilio Hernández-García
G. G. Szpiro
H. Kantz
J. D. Rodriguez
J. H. Holland
L. Ning
L. Sirovich
M. Meixner
P. Holmes
S. M. Zoldi
S. Zoldi
S. Ørstavik
U. Parlitz
V. M. Eguíluz
Publication venue: 'American Physical Society (APS)'
Publication date: 29/03/2000

A technique to forecast spatiotemporal time series is presented. It uses a Proper Orthogonal or Karhunen-Loève Decomposition to encode large spatiotemporal data sets in a few time series, and Genetic Algorithms to efficiently extract dynamical rules from the data. The method works very well for confined systems displaying spatiotemporal chaos, as exemplified here by forecasting the evolution of the one-dimensional complex Ginzburg-Landau equation in a finite domain. Comment: 4 pages, 5 figures
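
The encoding step mentioned in this abstract, compressing a spatiotemporal data set into a few time series, can be sketched with a singular value decomposition. This is a minimal illustration that assumes the data are stored as a (time, space) array; the function names `pod_encode` and `pod_decode` are hypothetical, and the genetic-algorithm forecasting step is not shown.

```python
import numpy as np


def pod_encode(data, n_modes):
    """Proper Orthogonal Decomposition of a (n_times, n_space) array.

    Returns the temporal mean field, the leading spatial modes, and the
    time series of coefficients on those modes.
    """
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    U, s, Vt = np.linalg.svd(data - mean, full_matrices=False)
    modes = Vt[:n_modes]                    # (n_modes, n_space) spatial patterns
    coeffs = U[:, :n_modes] * s[:n_modes]   # (n_times, n_modes) time series
    return mean, modes, coeffs


def pod_decode(mean, modes, coeffs):
    """Rebuild the (approximate) field from the retained modes."""
    return mean + coeffs @ modes


if __name__ == "__main__":
    t = np.linspace(0.0, 10.0, 200)[:, None]
    x = np.linspace(0.0, 1.0, 64)[None, :]
    # Synthetic field built from two space-time patterns.
    field = np.sin(t) * np.cos(2 * np.pi * x) + 0.5 * np.cos(2 * t) * np.sin(4 * np.pi * x)
    mean, modes, coeffs = pod_encode(field, n_modes=2)
    print(np.allclose(pod_decode(mean, modes, coeffs), field))
```

A forecasting method then only has to predict the few columns of `coeffs` instead of the full field.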

arXiv.org e-Print Archive

Crossref

Digital.CSIC

Modeling the Residential Infiltration of Outdoor PM2.5 in the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air)

Author: Cynthia L. Curl
Ed Avol
Joel D. Kaufman
L.-J. Sally Liu
Lianne Sheppard
Martin Cohen
Ryan W. Allen
Sara D. Adar
Szpiro AA
Timothy Larson
Publication venue: National Institute of Environmental Health Sciences
Publication date

Background: Epidemiologic studies of fine particulate matter [aerodynamic diameter ≤ 2.5 μm (PM2.5)] typically use outdoor concentrations as exposure surrogates. Failure to account for variation in residential infiltration efficiencies (Finf) will affect epidemiologic study results

Crossref

PubMed Central