Identification of novel risk factors for community-acquired Clostridium difficile infection using spatial statistics and geographic information system analyses
Background: The rate of community-acquired Clostridium difficile infection (CA-CDI) is increasing. While receipt of antibiotics remains an important risk factor for CDI, studies related to acquisition of C. difficile outside of hospitals are lacking. As a result, risk factors for exposure to C. difficile in community settings have been inadequately studied. Main objective: To identify novel environmental risk factors for CA-CDI. Methods: We performed a population-based retrospective cohort study of patients with CA-CDI from 1/1/2007 through 12/31/2014 in a 10-county area in central North Carolina. The 360 Census Tracts in these 10 counties were used as the demographic Geographic Information System (GIS) base-map. Longitude and latitude (X, Y) coordinates were generated from patient home addresses and overlaid on Census Tract polygons using ArcGIS; ArcView was used to assess "hot spots" or clusters of CA-CDI. We then constructed a mixed hierarchical model to identify environmental variables independently associated with increased rates of CA-CDI. Results: A total of 1,895 unique patients met our criteria for CA-CDI. The mean patient age was 54.5 years; 62% were female and 70% were Caucasian. Of the patient addresses, 402 (21%) were located in "hot spots" or clusters of CA-CDI (p < 0.001). "Hot spot" census tracts were scattered throughout the 10 counties. After adjusting for clustering and population density, age ≥ 60 years (p = 0.03), race (p < 0.001), proximity to a livestock farm (p = 0.01), proximity to farming raw materials services (p = 0.02), and proximity to a nursing home (p = 0.04) were independently associated with increased rates of CA-CDI. Conclusions: Our study is the first to use spatial statistics and mixed models to identify important environmental risk factors for acquisition of C. difficile and adds to the growing evidence that farm practices may put patients at risk for important drug-resistant infections.
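The overlay step in the Methods (assigning geocoded addresses to Census Tract polygons) can be illustrated with a minimal Python sketch. This is not the study's ArcGIS workflow: it uses a plain ray-casting point-in-polygon test, and the tract names, polygon shapes, and case coordinates are hypothetical values chosen only for illustration.

```python
from collections import Counter


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices in order; the last vertex is
    implicitly joined back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_cases_by_tract(points, tracts):
    """Count how many case coordinates fall inside each tract polygon."""
    counts = Counter({tract_id: 0 for tract_id in tracts})
    for x, y in points:
        for tract_id, polygon in tracts.items():
            if point_in_polygon(x, y, polygon):
                counts[tract_id] += 1
                break  # assign each case to at most one tract
    return counts


# Hypothetical tracts (two adjacent squares) and hypothetical case locations.
tracts = {
    "tract_A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "tract_B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
points = [(0.5, 0.5), (1.5, 1.0), (3.0, 1.0)]

counts = count_cases_by_tract(points, tracts)
for tract_id, n_cases in counts.items():
    print(tract_id, n_cases)
```

Per-tract counts like these would then be divided by tract population to obtain rates, which is the kind of quantity a cluster analysis or mixed model works with. Real census tract boundaries are irregular and would normally be handled by a GIS library rather than a hand-written test.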
Differential Expression Analysis of Complex RNA-seq Experiments Using edgeR
This article reviews the statistical theory underlying the edgeR software package for differential expression of RNA-seq data. Negative binomial models are used to capture the quadratic mean-variance relationship that can be observed in RNA-seq data. Conditional likelihood methods are used to avoid bias when estimating the level of variation. Empirical Bayes methods are used to allow gene-specific variation estimates even when the number of replicate samples is very small. Generalized linear models are used to accommodate arbitrarily complex designs. A key feature of the edgeR package is the use of weighted likelihood methods to implement a flexible empirical Bayes approach in the absence of easily tractable sampling distributions. The methodology is implemented in flexible software that is easy to use even for users who are not professional statisticians or bioinformaticians. The software is part of the Bioconductor project. This article describes some recently implemented features. Loess-style weighting is used to improve the weighted likelihood approach, and an analogy with quasilikelihood is used to estimate the optimal weight to be given to the empirical Bayes prior. The article includes a fully worked case study with complete code.
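The quadratic mean-variance relationship mentioned in the abstract can be made concrete with a short sketch. Under the standard negative binomial parameterization, a count with mean mu and dispersion phi has variance mu + phi * mu^2, so the squared coefficient of variation is 1/mu + phi. The Python below is only an illustration of that formula, not edgeR code (edgeR itself is an R package), and the dispersion value of 0.1 is a hypothetical choice.

```python
def nb_variance(mu, dispersion):
    """Negative binomial variance: Poisson part (mu) plus quadratic part."""
    return mu + dispersion * mu ** 2


def squared_cv(mu, dispersion):
    """Squared coefficient of variation, equal to 1/mu + dispersion."""
    return nb_variance(mu, dispersion) / mu ** 2


dispersion = 0.1  # hypothetical value, chosen for illustration only

for mu in (10, 100, 1000):
    var = nb_variance(mu, dispersion)
    cv2 = squared_cv(mu, dispersion)
    print(f"mean={mu:5d}  variance={var:10.1f}  squared CV={cv2:.4f}")
```

As the mean grows, the 1/mu term shrinks and the squared coefficient of variation approaches the dispersion, which is why the dispersion is the quantity of interest when estimating the level of variation between replicates. Setting the dispersion to zero recovers the Poisson case, where the variance equals the mean.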