
    Multi-dimensional Point Process Models in R

    A software package for fitting and assessing multi-dimensional point process models using the R statistical computing environment is described. Methods of residual analysis based on random thinning are discussed and implemented. Features of the software are demonstrated using data on wildfire occurrences in Northern Los Angeles County, California.
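
The abstract mentions residual analysis by random thinning without giving details. As a minimal sketch of one common form of the idea, assuming a fitted intensity function and a known lower bound `b` on that intensity over the observation region (both hypothetical names, not the package's API), each observed point is kept with probability `b / fitted_intensity(point)`. If the fitted model is correct, the retained points should resemble a homogeneous Poisson process with rate `b`.

```python
import random


def thin_points(points, fitted_intensity, b, rng=None):
    """Randomly thin a point pattern using a fitted intensity.

    points           -- iterable of observed points (any hashable or tuple form)
    fitted_intensity -- hypothetical callable returning the fitted intensity at a point
    b                -- assumed lower bound of the fitted intensity over the region
    rng              -- optional random.Random instance for reproducibility
    """
    rng = rng or random.Random(0)
    kept = []
    for p in points:
        lam = fitted_intensity(p)
        if lam < b:
            raise ValueError("b must not exceed the fitted intensity at any point")
        # Keep the point with probability b / lam.
        if rng.random() < b / lam:
            kept.append(p)
    return kept
```

Departures of the retained points from spatial uniformity would then suggest regions where the fitted intensity is too high or too low.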

    Caching and Visualizing Statistical Analyses

    We present the cacher and CodeDepends packages for R, which provide tools for (1) caching and analyzing the code for statistical analyses and (2) distributing these analyses to others in an efficient manner over the web. The cacher package takes objects created by evaluating R expressions and stores them in key-value databases. These databases of cached objects can subsequently be assembled into “cache packages” for distribution over the web. The cacher package also provides tools to help readers examine the data and code in a statistical analysis and reproduce, modify, or improve upon the results. In addition, readers can easily conduct alternate analyses of the data. The CodeDepends package provides complementary tools for analyzing and visualizing the code for a statistical analysis and this functionality has been integrated into the cacher package. In this chapter we describe the cacher and CodeDepends packages and provide examples of how they can be used for reproducible research.
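
The idea of storing evaluated expressions in a key-value database can be illustrated with a small sketch. This is not the cacher package's interface: the class name, the use of a SHA-1 digest of the expression text as the key, and the in-memory dictionary standing in for the database are all assumptions made for illustration.

```python
import hashlib


class ExpressionCache:
    """Toy key-value cache of evaluated expressions (illustrative only)."""

    def __init__(self):
        # In-memory dict standing in for a key-value database.
        self.store = {}

    def key_for(self, source):
        # Assumption: the key is a digest of the expression's source text.
        return hashlib.sha1(source.encode("utf-8")).hexdigest()

    def evaluate(self, source, env):
        """Evaluate `source` in `env`, reusing a stored result when available.

        Note: the key ignores `env`, so this sketch assumes the inputs
        are unchanged between calls.
        """
        key = self.key_for(source)
        if key not in self.store:
            self.store[key] = eval(source, {}, env)
        return self.store[key]
```

A second call with the same expression text returns the stored object without re-running the computation, which is what makes a cached analysis cheap to redistribute and inspect.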

    Spatial Misalignment in time series studies of air pollution and health data

    Time series studies of environmental exposures often involve comparing daily changes in a toxicant measured at a point in space with daily changes in an aggregate measure of health. Spatial misalignment of the exposure and response variables can bias the estimation of health risk, and the magnitude of this bias depends on the spatial variation of the exposure of interest. In air pollution epidemiology, there is an increasing focus on estimating the health effects of the chemical components of particulate matter. One issue that is raised by this new focus is the spatial misalignment error introduced by the lack of spatial homogeneity in many of the particulate matter components. Current approaches to estimating short-term health risks via time series modeling do not take into account the spatial properties of the chemical components and therefore could result in biased estimation of those risks. We present a spatial-temporal statistical model for quantifying spatial misalignment error and show how adjusted health risk estimates can be obtained using a regression calibration approach and a two-stage Bayesian model. We apply our methods to a database containing information on hospital admissions, air pollution, and weather for 20 large urban counties in the United States.

    Reduced Bayesian Hierarchical Models: Estimating Health Effects of Simultaneous Exposure to Multiple Pollutants

    Quantifying the health effects associated with simultaneous exposure to many air pollutants is now a research priority of the US EPA. Bayesian hierarchical models (BHM) have been extensively used in multisite time series studies of air pollution and health to estimate health effects of a single pollutant adjusted for potential confounding of other pollutants and other time-varying factors. However, when the scientific goal is to estimate the impacts of many pollutants jointly, a straightforward application of BHM is challenged by the need to specify a random-effect distribution on a high-dimensional vector of nuisance parameters, which often do not have an easy interpretation. In this paper we introduce a new BHM formulation, which we call reduced BHM, aimed at analyzing clustered data sets in the presence of a large number of random effects that are not of primary scientific interest. At the first stage of the reduced BHM, we calculate the integrated likelihood of the parameter of interest (e.g., excess number of deaths attributed to simultaneous exposure to high levels of many pollutants). At the second stage, we specify a flexible random-effect distribution directly on the parameter of interest. The reduced BHM overcomes many of the challenges in the specification and implementation of full BHM in the context of a large number of nuisance parameters. In simulation studies we show that the reduced BHM performs comparably to the full BHM in many scenarios, and even performs better in some cases. Methods are applied to estimate location-specific and overall relative risks of cardiovascular hospital admissions associated with simultaneous exposure to elevated levels of particulate matter and ozone in 51 US counties during the period 1999-2005.

    Bayesian Model Averaging for Clustered Data: Imputing Missing Daily Air Pollution Concentration

    The presence of missing observations is a challenge in statistical analysis, especially when data are clustered. In this paper, we develop a Bayesian model averaging (BMA) approach for imputing missing observations in clustered data. Our approach extends BMA by allowing the weights of competing regression models for missing data imputation to vary between clusters while borrowing information across clusters in estimating model parameters. Through simulation and cross-validation studies, we demonstrate that our approach outperforms the standard BMA imputation approach where model weights are assumed to be the same for all clusters. We then apply our proposed method to a national dataset of daily ambient coarse particulate matter (PM10-2.5) concentration between 2003 and 2005. We impute missing daily monitor-level PM10-2.5 measurements and estimate the posterior probability of PM10-2.5 nonattainment status for 95 US counties based on the Environmental Protection Agency's proposed 24-hour standard.
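
To make the notion of cluster-specific model weights concrete, the sketch below computes a separate set of weights for each cluster and uses them to average the competing models' predictions for a missing value. It is a simplification, not the paper's method: the weights come from the standard BIC approximation to posterior model probabilities (equal prior model probabilities assumed), no information is borrowed across clusters, and all function names and inputs are hypothetical.

```python
import math


def bma_weights(bics):
    """Approximate posterior model weights from BIC values.

    Uses weight_k proportional to exp(-BIC_k / 2), shifted by the
    smallest BIC for numerical stability.
    """
    best = min(bics)
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


def cluster_weights(bics_by_cluster):
    """Compute one weight vector per cluster, so weights may differ by cluster."""
    return {c: bma_weights(b) for c, b in bics_by_cluster.items()}


def impute(predictions, weights):
    """Model-averaged imputation: weighted mean of the models' predictions."""
    return sum(w * p for w, p in zip(weights, predictions))
```

Under the standard approach described in the abstract, a single weight vector would be shared by every cluster; here a cluster whose data favor a different model receives different weights.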