Adjusting for Confounding by Neighborhood Using a Proportional Odds Model and Complex Survey Data
In social epidemiology, an individual's neighborhood is considered to be an important determinant of health behaviors, mediators, and outcomes. Consequently, when investigating health disparities, researchers may wish to adjust for confounding by unmeasured neighborhood factors, such as local availability of health facilities or cultural predispositions. With a simple random sample and a binary outcome, a conditional logistic regression analysis that treats individuals within a neighborhood as a matched set is a natural method to use. The authors present a generalization of this method for ordinal outcomes and complex sampling designs. The method is based on a proportional odds model and is very simple to program using standard software such as SAS PROC SURVEYLOGISTIC (SAS Institute Inc., Cary, North Carolina). The authors applied the method to analyze racial/ethnic differences in dental preventative care, using 2008 Florida Behavioral Risk Factor Surveillance System survey data. The ordinal outcome represented time since last dental cleaning, and the authors adjusted for individual-level confounding by gender, age, education, and health insurance coverage. The authors compared results with and without additional adjustment for confounding by neighborhood, operationalized as zip code. The authors found that adjustment for confounding by neighborhood greatly affected the results in this example.
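For readers unfamiliar with the model named in this abstract, the generic proportional odds model for an ordinal outcome with K ordered categories can be written as below. This is the textbook form only, not the authors' neighborhood-adjusted estimator, which the abstract does not spell out. The notation (i for neighborhood, j for individual, x for individual-level covariates) and the sign convention are assumptions made for illustration.

```latex
% Generic proportional odds (cumulative logit) model; notation is illustrative.
\[
  \operatorname{logit} \Pr\left(Y_{ij} \le k \mid x_{ij}\right)
    = \alpha_k + x_{ij}^{\top}\beta,
  \qquad k = 1, \dots, K-1,
\]
% with ordered cutpoints \alpha_1 \le \alpha_2 \le \dots \le \alpha_{K-1}.
```

The single coefficient vector shared across all K-1 cutpoints is what makes the odds "proportional": each covariate shifts every cumulative log-odds by the same amount.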
A new approach to hierarchical data analysis: Targeted maximum likelihood estimation for the causal effect of a cluster-level exposure
We often seek to estimate the impact of an exposure naturally occurring or
randomly assigned at the cluster-level. For example, the literature on
neighborhood determinants of health continues to grow. Likewise, community
randomized trials are applied to learn about real-world implementation,
sustainability, and population effects of interventions with proven
individual-level efficacy. In these settings, individual-level outcomes are
correlated due to shared cluster-level factors, including the exposure, as well
as social or biological interactions between individuals. To flexibly and
efficiently estimate the effect of a cluster-level exposure, we present two
targeted maximum likelihood estimators (TMLEs). The first TMLE is developed
under a non-parametric causal model, which allows for arbitrary interactions
between individuals within a cluster. These interactions include direct
transmission of the outcome (i.e. contagion) and influence of one individual's
covariates on another's outcome (i.e. covariate interference). The second TMLE
is developed under a causal sub-model assuming the cluster-level and
individual-specific covariates are sufficient to control for confounding.
Simulations compare the alternative estimators and illustrate the potential
gains from pairing individual-level risk factors and outcomes during
estimation, while avoiding unwarranted assumptions. Our results suggest that
estimation under the sub-model can result in bias and misleading inference in
an observational setting. Incorporating working assumptions during estimation
is more robust than assuming they hold in the underlying causal model. We
illustrate our approach with an application to HIV prevention and treatment.
Bayesian nonparametric models for spatially indexed data of mixed type
We develop Bayesian nonparametric models for spatially indexed data of mixed
type. Our work is motivated by challenges that occur in environmental
epidemiology, where the usual presence of several confounding variables that
exhibit complex interactions and high correlations makes it difficult to
estimate and understand the effects of risk factors on health outcomes of
interest. The modeling approach we adopt assumes that responses and confounding
variables are manifestations of continuous latent variables, and uses
multivariate Gaussians to jointly model these. Responses and confounding
variables are not treated equally: only the relevant parameters of the
distributions of the responses are modeled in terms of explanatory variables or
risk factors. Spatial dependence is introduced by allowing the weights of the
nonparametric process priors to be location specific, obtained as probit
transformations of Gaussian Markov random fields. Confounding variables and
spatial configuration have a similar role in the model, in that they only
influence, along with the responses, the allocation probabilities of the areas
into the mixture components, thereby allowing for flexible adjustment of the
effects of observed confounders, while allowing for the possibility of residual
spatial structure, possibly occurring due to unmeasured or undiscovered
spatially varying factors. Aspects of the model are illustrated in simulation
studies and an application to a real data set.
A Primer on Causality in Data Science
Many questions in Data Science are fundamentally causal in that our objective
is to learn the effect of some exposure, randomized or not, on an outcome
of interest. Even studies that are seemingly non-causal, such as those with the
goal of prediction or prevalence estimation, have causal elements, including
differential censoring or measurement. As a result, we, as Data Scientists,
need to consider the underlying causal mechanisms that gave rise to the data,
rather than simply the pattern or association observed in those data. In this
work, we review the 'Causal Roadmap' of Petersen and van der Laan (2014) to
provide an introduction to some key concepts in causal inference. Similar to
other causal frameworks, the steps of the Roadmap include clearly stating the
scientific question, defining the causal model, translating the scientific
question into a causal parameter, assessing the assumptions needed to express
the causal parameter as a statistical estimand, implementing statistical
estimators, including parametric and semi-parametric methods, and interpreting
our findings. We believe that using such a framework in Data Science will
help to ensure that our statistical analyses are guided by the scientific
question driving our research, while avoiding over-interpreting our results. We
focus on the effect of an exposure occurring at a single time point and
highlight the use of targeted maximum likelihood estimation (TMLE) with Super
Learner.
Comment: 26 pages (with references); 4 figures
On a preference‐based instrumental variable approach in reducing unmeasured confounding‐by‐indication
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/110880/1/sim6404.pd
Evaluating Risks from Antibacterial Medication Therapy
Abstract. Evaluating Risks from Antibacterial Medication Therapy Using an Observational Primary Care Database. Sharon B. Meropol, Joshua P. Metlay.
Virtually everyone in the U.S. is exposed to antibacterial drugs at some point in their lives. It is important to understand the benefits and risks related to these medications with nearly universal public exposure. Most information on antibacterial drug-associated adverse events comes from spontaneous reports. Without an unexposed control group, it is impossible to know the real risks for treated vs. untreated patients. We used an electronic medical record database to select a cohort of office visits for non-bacterial acute respiratory tract infections (excluding patients with pneumonia, sinusitis, or acute exacerbations of chronic bronchitis), and compared outcomes of antibacterial drug-exposed vs. -unexposed patients. By limiting our assessment to visits with acute nonspecific respiratory infections, we promoted comparability between exposed and unexposed patients. To further control for confounding by indication and practice, we explored methods to promote further comparability between exposure groups. Our rare outcome presented an additional analytic challenge. Antibacterial drug prescribing for acute nonspecific respiratory infections decreased over the study period, but, in contrast to the U.S., broad spectrum antibacterial prescribing remained low. Conditional fixed effects linear regression provided stable estimates of exposure effects on rare outcomes; results were similar to those using more traditional methods for binary outcomes. Patients with acute nonspecific respiratory infections treated with antibacterial drugs were not at increased risk of severe adverse events compared to untreated patients. Patients with acute nonspecific respiratory infections exposed to antibacterials had a small decreased risk of pneumonia hospitalizations vs. unexposed patients.
This very small measurable benefit of antibacterial drug therapy for acute nonspecific respiratory infections at the patient level must be weighed against the public health risk of emerging antibacterial resistance. Our data provide valuable point estimates of risks and benefits that can be used to inform future decision analysis and guideline recommendations for patients with acute nonspecific respiratory infections. Ultimately, improved point-of-care diagnostic testing may help direct antibacterial drugs to the subset of patients most likely to derive benefit.
Comparing the estimates of effect obtained from statistical causal inference methods: An example using bovine respiratory disease in feedlot cattle
The causal effect of an exposure on an outcome of interest in an observational study cannot be estimated directly if the confounding variables are not controlled. Many approaches are available for estimating the causal effect of an exposure. In this manuscript, we demonstrate the advantages, in terms of reduced bias, of using inverse probability weighting (IPW) and doubly robust estimation of the odds ratio. The IPW approach can be used to adjust for confounding variables and provide unbiased estimates of the exposure's causal effect. For cluster-structured data, as is common in animal populations, the inverse conditional probability weighting (ICPW) approach can provide a robust estimate of the causal effect. Doubly robust estimation can provide a robust method even when the specification of the model form is uncertain. In this paper, the use of the IPW, ICPW, and doubly robust approaches is illustrated with a subset of data with complete covariates from the Australian-based National Bovine Respiratory Disease Initiative as well as simulated data. We evaluate the causal effect of prior bovine viral diarrhea exposure on bovine respiratory disease in feedlot cattle. The results show that the IPW, ICPW, and doubly robust approaches provide a more accurate estimate of the exposure effect than the traditional outcome regression model, and that doubly robust approaches are the most preferable overall.
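As a rough illustration of the IPW idea described in this abstract, the sketch below estimates a marginal odds ratio on simulated data with a single confounder. It is a minimal sketch, not the authors' analysis: the data-generating values, the function names (`fit_logistic`, `ipw_odds_ratio`), and the choice to fit the propensity model with a hand-written Newton-Raphson routine are all assumptions made for illustration. It covers plain IPW only, not ICPW or doubly robust estimation.

```python
import numpy as np

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (y - p)                        # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def odds(p):
    return p / (1.0 - p)

def ipw_odds_ratio(a, y, ps):
    """Marginal odds ratio using inverse-probability-of-exposure weights."""
    w = np.where(a == 1, 1.0 / ps, 1.0 / (1.0 - ps))
    p1 = np.sum(w * a * y) / np.sum(w * a)              # weighted risk if exposed
    p0 = np.sum(w * (1 - a) * y) / np.sum(w * (1 - a))  # weighted risk if unexposed
    return odds(p1) / odds(p0)

# Simulated data (hypothetical): confounder c drives both exposure a and
# outcome y, while a itself has NO effect on y, so the true marginal OR is 1.
rng = np.random.default_rng(0)
n = 50_000
c = rng.normal(size=n)
a = rng.binomial(1, expit(0.8 * c))
y = rng.binomial(1, expit(-1.0 + 1.0 * c))

# Estimate the propensity score P(a = 1 | c) and compare estimators.
X = np.column_stack([np.ones(n), c])
ps = expit(X @ fit_logistic(X, a))
naive_or = odds(y[a == 1].mean()) / odds(y[a == 0].mean())
ipw_or = ipw_odds_ratio(a, y, ps)
print(f"naive OR: {naive_or:.2f}, IPW OR: {ipw_or:.2f}")
```

Because the exposure has no effect in this simulation, the crude odds ratio is pulled away from 1 by confounding, while the weighted estimate should sit close to 1, provided the propensity model is correctly specified.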