An analytic framework for exploring sampling and observation process biases in genome and phenome‐wide association studies using electronic health records
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/155932/1/sim8524.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/155932/2/SIM8524-sup-0001-supinfo.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/155932/3/sim8524_am.pd
Patient Recruitment Using Electronic Health Records Under Selection Bias: a Two-phase Sampling Framework
Electronic health records (EHRs) are increasingly recognized as a
cost-effective resource for patient recruitment in clinical research. However,
how to optimally select a cohort from millions of individuals to answer a
scientific question of interest remains unclear. Consider a study to estimate
the mean or mean difference of an expensive outcome. Inexpensive auxiliary
covariates predictive of the outcome may often be available in patients' health
records, presenting an opportunity to recruit patients selectively, which may
improve efficiency in downstream analyses. In this paper, we propose a
two-phase sampling design that leverages available information on auxiliary
covariates in EHR data. A key challenge in using EHR data for multi-phase
sampling is the potential selection bias, because EHR data are not necessarily
representative of the target population. Extending existing literature on
two-phase sampling design, we derive an optimal two-phase sampling method that
improves efficiency over random sampling while accounting for the potential
selection bias in EHR data. We demonstrate the efficiency gain from our
sampling design via simulation studies and an application to evaluating the
prevalence of hypertension among US adults leveraging data from the Michigan
Genomics Initiative, a longitudinal biorepository in Michigan Medicine.
Multiple imputation of missing covariates for the Cox proportional hazards cure model
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/134146/1/sim7048_am.pdf
http://deepblue.lib.umich.edu/bitstream/2027.42/134146/2/sim7048.pd
Empirical Validation of a New Data Product from the Interstellar Boundary Explorer Satellite
Since 2008, the Interstellar Boundary Explorer (IBEX) satellite has been
gathering data on heliospheric energetic neutral atoms (ENAs) while being
exposed to various sources of background noise, such as cosmic rays and solar
energetic particles. The IBEX mission initially released only a qualified
triple-coincidence (qABC) data product, which was designed to provide
observations of ENAs free of background contamination. Further measurements
revealed that the qABC data was in fact susceptible to contamination, having
relatively low ENA counts and high background rates. Recently, the mission team
considered releasing a certain qualified double-coincidence (qBC) data product,
which has roughly twice the detection rate of the qABC data product. This paper
presents a simulation-based validation of the new qBC data product against the
already-released qABC data product. The results show that the qBCs can
plausibly be said to share the same signal rate as the qABCs up to an average
absolute deviation of 3.6%. Visual diagnostics at an orbit, map, and full
mission level provide additional confirmation of signal rate coherence across
data products. These approaches are generalizable to other scenarios in which
one wishes to test whether multiple observations could plausibly be generated
by some underlying shared signal.
Individualized outcome prognostication for patients with laryngeal cancer
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/142424/1/cncr31087.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/142424/2/cncr31087_am.pd
Predictors of survival after total laryngectomy for recurrent/persistent laryngeal squamous cell carcinoma
Background: Total laryngectomy remains the treatment of choice for recurrent/persistent laryngeal squamous cell carcinoma (SCC) after radiotherapy (RT) or chemoradiotherapy (CRT). However, despite attempts at aggressive surgical salvage, survival in this cohort remains suboptimal.
Methods: A prospectively maintained single‐institution database was queried for patients undergoing total laryngectomy for recurrent/persistent laryngeal SCC after initial RT/CRT between 1998 and 2015 (n = 244). Demographic, clinical, and survival data were abstracted. The Kaplan‐Meier survival curves and hazard ratios (HRs) were calculated.
Results: Five‐year overall survival (OS) was 49%. Five‐year disease‐free survival (DFS) was 58%. Independent predictors of OS included severe comorbidity (Adult Comorbidity Evaluation‐27 [ACE‐27] scale; HR 3.76; 95% confidence interval [CI] 1.56‐9.06), and positive recurrent clinical nodes (HR 2.91; 95% CI 1.74‐4.88).
Conclusion: Severe comorbidity status is the strongest predictor of OS, suggesting that increased attention to mitigating competing risks to health is critical. These data may inform a risk prediction model to allow for focused shared decision making, preoperative health optimization, and patient selection for adjuvant therapies.
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/139972/1/hed24918.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/139972/2/hed24918_am.pd
Germline variants and breast cancer survival in patients with distant metastases at primary breast cancer diagnosis.
Breast cancer metastasis accounts for most of the deaths from breast cancer. Identification of germline variants associated with survival in aggressive types of breast cancer may inform understanding of breast cancer progression and assist treatment. In this analysis, we studied the associations between germline variants and breast cancer survival for patients with distant metastases at primary breast cancer diagnosis. We used data from the Breast Cancer Association Consortium (BCAC) including 1062 women of European ancestry with metastatic breast cancer, 606 of whom died of breast cancer. We identified two germline variants on chromosome 1, rs138569520 and rs146023652, significantly associated with breast cancer-specific survival (P = 3.19 × 10⁻⁸ and 4.42 × 10⁻⁸). In silico analysis suggested a potential regulatory effect of the variants on the nearby target genes SDE2 and H3F3A. However, the variants showed no evidence of association in a smaller replication dataset. The validation dataset was obtained from the SNPs to Risk of Metastasis (StoRM) study and included 293 patients with metastatic primary breast cancer at diagnosis. Ultimately, larger replication studies are needed to confirm the identified associations.