29 research outputs found

    Special Topics in Latent Variable Models with Spatially and Temporally Correlated Latent Variables

    The term latent variable model (LVM) refers to any statistical procedure that utilizes information contained in a set of observed variables to construct a set of underlying latent variables that drive the observed values and associations. Independent component analysis (ICA) is an LVM that separates recorded mixtures of signals into independent source signals, called independent components (ICs). ICA is a popular tool for separating brain signals of interest from artifacts and noise in electroencephalogram (EEG) data. Due to challenges in the estimation of uncertainties in ICA, standard errors are not generally estimated alongside ICA estimates, and thus ICs representing brain signals of interest cannot be distinguished through a statistical hypothesis testing framework. In Chapter 2 of this dissertation, we propose a bootstrapping algorithm for ICA that produces bootstrap samples that retain critical correlation structures in the data. These are used to compute uncertainties for ICA parameter estimates and to construct a hypothesis test to identify ICs representing brain activity, which we demonstrate in the context of EEG functional connectivity. In Chapter 3, we extend this bootstrapping approach to accommodate pre-ICA dimension reduction procedures, and we use the resulting method to compare popular strategies for pre-ICA dimension reduction in EEG research. In the final chapter, we turn our attention to another LVM, factor analysis, which utilizes the covariance structure of a set of correlated observed variables to model a smaller number of unmeasured underlying variables. A spatial factor analysis (SFA) model can be used to quantify the social vulnerability of communities based on a set of observed social variables. Current SFA methodology is ill-equipped to handle spatial misalignment in the observed variables. We propose a joint spatial factor analysis model that identifies a common set of latent variables underlying spatially misaligned observed variables and produces results at the level of the smallest spatial units, thereby minimizing loss of information. We apply this model to spatially misaligned data to construct an index of community social vulnerability for Louisiana, which we integrate with Louisiana flood data to identify communities at high risk during natural disasters, based on both social and geographic features.
    Doctor of Philosophy

    Causal exposure-response curve estimation with surrogate confounders: a study of air pollution and children's health in Medicaid claims data

    In this paper, we undertake a case study in which interest lies in estimating a causal exposure-response function (ERF) for long-term exposure to fine particulate matter (PM2.5) and respiratory hospitalizations in socioeconomically disadvantaged children using nationwide Medicaid claims data. New methods are needed to address the specific challenges the Medicaid data present. First, Medicaid eligibility criteria, which are largely based on family income for children, differ by state, creating socioeconomically distinct populations and leading to clustered data, where zip codes (our units of analysis) are nested within states. Second, Medicaid enrollees' individual-level socioeconomic status, which is known to be a confounder and an effect modifier of the exposure-response relationships under study, is not available. However, two useful surrogates are available: median household income of each enrollee's zip code of residence and state-level Medicaid family income eligibility thresholds for children. In this paper, we introduce a customized approach, called MedMatch, that builds on generalized propensity score matching methods for estimating causal ERFs, adapting these approaches to leverage our two surrogate variables to account for potential confounding and/or effect modification by socioeconomic status. We conduct extensive simulation studies, consistently demonstrating the strong performance of MedMatch relative to conventional approaches to handling the surrogate variables. We apply MedMatch to estimate the causal ERF between long-term PM2.5 exposure and first respiratory hospitalization among children in Medicaid from 2000 to 2012. We find a positive association, with a steeper curve at PM2.5 ≤ 8 μg/m³ that levels off at higher concentrations.
    Comment: 38 pages, 5 figures

    Estimating a Causal Exposure Response Function with a Continuous Error-Prone Exposure: A Study of Fine Particulate Matter and All-Cause Mortality

    Numerous studies have examined the associations between long-term exposure to fine particulate matter (PM2.5) and adverse health outcomes. Recently, many of these studies have begun to employ high-resolution predicted PM2.5 concentrations, which are subject to measurement error. Previous approaches for exposure measurement error correction have either been applied in non-causal settings or have only considered a categorical exposure. Moreover, most procedures have failed to account for uncertainty induced by error correction when fitting an exposure-response function (ERF). To remedy these deficiencies, we develop a multiple imputation framework that combines regression calibration and Bayesian techniques to estimate a causal ERF. We demonstrate how the output of the measurement error correction steps can be seamlessly integrated into a Bayesian additive regression trees (BART) estimator of the causal ERF. We also demonstrate how locally weighted smoothing of the posterior samples from BART can be used to create a more accurate ERF estimate. Our proposed approach also properly propagates the exposure measurement error uncertainty to yield accurate standard error estimates. We assess the robustness of our proposed approach in an extensive simulation study. We then apply our methodology to estimate the effects of PM2.5 on all-cause mortality among Medicare enrollees in New England from 2000 to 2012.

    Severe flooding and cause-specific hospitalization in the United States

    Flooding is one of the most disruptive and costliest climate-related disasters and presents an escalating threat to population health due to climate change and urbanization patterns. Previous studies have investigated the consequences of flood exposures on only a handful of health outcomes and have focused on a single flood event or affected region. To address this gap, we conducted a nationwide, multi-decade analysis of the impacts of severe floods on a wide range of health outcomes in the United States by linking a novel satellite-based high-resolution flood exposure database with Medicare cause-specific hospitalization records over the period 2000-2016. Using a self-matched study design with a distributed lag model, we examined how cause-specific hospitalization rates deviate from expected rates during and up to four weeks after severe flood exposure. Our results revealed that risk of hospitalization was consistently elevated during and for at least four weeks following severe flood exposure for nervous system diseases (3.5%; 95% confidence interval [CI]: 0.6%, 6.4%), skin and subcutaneous tissue diseases (3.4%; 95% CI: 0.3%, 6.7%), and injury and poisoning (1.5%; 95% CI: -0.07%, 3.2%). Increases in hospitalization rate for these causes, musculoskeletal system diseases, and mental health-related impacts varied based on proportion of Black residents in each ZIP Code. Our findings demonstrate the need for targeted preparedness strategies for hospital personnel before, during, and after severe flooding.

    Impacts of Census Differential Privacy for Small-Area Disease Mapping to Monitor Health Inequities

    The US Census Bureau will implement a new privacy-preserving disclosure avoidance system (DAS), which includes application of differential privacy, on publicly released 2020 census data. There are concerns that the DAS may bias small-area and demographically stratified population counts, which play a critical role in public health research, serving as denominators in estimation of disease/mortality rates. Employing three DAS demonstration products, we quantify errors attributable to reliance on DAS-protected denominators in standard small-area disease mapping models for characterizing health inequities. We conduct simulation studies and real data analyses of inequities in premature mortality at the census tract level in Massachusetts and Georgia. Results show that overall patterns of inequity by racialized group and economic deprivation level are not compromised by the DAS. While early versions of DAS induce errors in mortality rate estimation that are larger for Black than non-Hispanic white populations in Massachusetts, this issue is ameliorated in newer DAS versions.

    A common spatial factor analysis model for measured neighborhood-level characteristics: The Multi-Ethnic Study of Atherosclerosis

    The purpose of this study was to reduce the dimensionality of a set of neighborhood-level variables collected on participants in the Multi-Ethnic Study of Atherosclerosis (MESA) while appropriately accounting for the spatial structure of the data. A common spatial factor analysis model in the Bayesian setting was used to properly characterize dependencies in the data. Results suggest that use of the spatial factor model can result in more precise estimation of factor scores, improved insight into the spatial patterns in the data, and the ability to more accurately assess associations between the neighborhood environment and health outcomes.