
    Complex modeling with detailed temporal predictors does not improve health records-based suicide risk prediction

    Suicide risk prediction models can identify individuals for targeted intervention. Discussions of transparency, explainability, and transportability in machine learning presume that complex prediction models with many variables outperform simpler models. We compared random forest, artificial neural network, and ensemble models with 1500 temporally defined predictors to logistic regression models. Data from 25,800,888 mental health visits made by 3,081,420 individuals in 7 health systems were used to train and evaluate suicidal behavior prediction models. Model performance was compared across several measures. All models performed well (area under the receiver operating characteristic curve [AUC]: 0.794-0.858). Ensemble models performed best, but improvements over a regression model with 100 predictors were minimal (AUC improvements: 0.006-0.020). Results were consistent across performance metrics and across subgroups defined by race, ethnicity, and sex. Our results suggest that simpler parametric models, which are easier to implement as part of routine clinical practice, perform comparably to more complex machine learning methods.
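    The comparison described above can be illustrated with a small sketch: fit a logistic regression and a random forest on the same rare-outcome data and compare test-set AUCs. The synthetic data, sample sizes, and model settings below are assumptions for illustration only, not the study's data or pipeline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced data (5% positives), far smaller than the
# study's 25.8 million visits and 1500 predictors.
X, y = make_classification(n_samples=4000, n_features=50, n_informative=10,
                           weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5,
                                          stratify=y, random_state=0)

simple = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
complex_model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

auc_simple = roc_auc_score(y_te, simple.predict_proba(X_te)[:, 1])
auc_complex = roc_auc_score(y_te, complex_model.predict_proba(X_te)[:, 1])
print(f"logistic AUC={auc_simple:.3f}  random forest AUC={auc_complex:.3f}")
```

    On simple synthetic data such as this, the gap between the two models is often small, which is the pattern the study reports at scale.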

    Prediction of overall survival for patients with metastatic castration-resistant prostate cancer: development of a prognostic model through a crowdsourced challenge with open clinical trial data

    Background Improvements to prognostic models in metastatic castration-resistant prostate cancer have the potential to augment clinical trial design and guide treatment strategies. In partnership with Project Data Sphere, a not-for-profit initiative that allows data from cancer clinical trials to be shared broadly with researchers, we designed an open-data, crowdsourced DREAM (Dialogue for Reverse Engineering Assessments and Methods) challenge to identify a better prognostic model for predicting survival in patients with metastatic castration-resistant prostate cancer and to engage a community of international data scientists in studying this disease.

    Methods Data from the comparator arms of four phase 3 clinical trials in first-line metastatic castration-resistant prostate cancer were obtained from Project Data Sphere: 476 patients treated with docetaxel and prednisone from the ASCENT2 trial, 526 patients treated with docetaxel, prednisone, and placebo in the MAINSAIL trial, 598 patients treated with docetaxel, prednisone or prednisolone, and placebo in the VENICE trial, and 470 patients treated with docetaxel and placebo in the ENTHUSE 33 trial. Datasets comprising more than 150 clinical variables were curated centrally, including demographics, laboratory values, medical history, lesion sites, and previous treatments. Data from ASCENT2, MAINSAIL, and VENICE were released publicly as training data for predicting the outcome of interest, namely overall survival. Clinical data were also released for ENTHUSE 33, but the outcome variables (overall survival and event status) were hidden from challenge participants so that ENTHUSE 33 could serve as an independent validation set. Methods were evaluated using the integrated time-dependent area under the curve (iAUC). The reference model, based on eight clinical variables and a penalised Cox proportional-hazards model, was used as the benchmark for method performance. Further validation was done using data from a fifth trial, ENTHUSE M1, in which 266 patients with metastatic castration-resistant prostate cancer were treated with placebo alone.

    Findings 50 independent methods were developed to predict overall survival and were evaluated through the DREAM challenge. The top performer was based on an ensemble of penalised Cox regression models (ePCR), which uniquely identified predictive interaction effects with immune biomarkers and markers of hepatic and renal function. Overall, ePCR outperformed all other methods (iAUC 0.791; Bayes factor >5) and surpassed the reference model (iAUC 0.743; Bayes factor >20). Both the ePCR and reference models stratified patients in the ENTHUSE 33 trial into high-risk and low-risk groups with significantly different overall survival (ePCR: hazard ratio 3.32, 95% CI 2.39-4.62).

    Interpretation Novel prognostic factors were delineated, and the assessment of 50 methods developed by independent international teams establishes a benchmark for future method development. The results of this effort show that data sharing, combined with a crowdsourced challenge, is a robust and powerful framework for developing new prognostic models in advanced prostate cancer.

    Peer reviewed
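    The core building block of the winning ePCR method, a penalised Cox proportional-hazards fit, can be sketched with plain numpy. The simulated one-covariate data, ridge penalty, and gradient-descent optimiser below are illustrative assumptions, not the ePCR implementation (which ensembled many such models over real clinical variables).

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated right-censored survival data with one covariate and a true
# log-hazard ratio of 1.0 (hypothetical data).
n = 2000
x = rng.normal(size=(n, 1))
true_beta = np.array([1.0])
t_event = rng.exponential(1.0 / np.exp(x @ true_beta))
t_cens = rng.exponential(2.0, size=n)
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(float)

def penalised_cox_grad(beta, x, time, event, penalizer=0.1):
    """Gradient of the ridge-penalised Cox negative log partial likelihood.
    Sorting by time descending lets a cumulative sum compute each risk set."""
    order = np.argsort(-time)
    xs, es = x[order], event[order]
    w = np.exp(xs @ beta)
    cum_w = np.cumsum(w)                        # sum of exp(x'b) over each risk set
    cum_wx = np.cumsum(w[:, None] * xs, axis=0)
    score = (es[:, None] * (xs - cum_wx / cum_w[:, None])).sum(axis=0)
    return -score / es.sum() + 2 * penalizer * beta

# Plain gradient descent on the penalised partial likelihood.
beta = np.zeros(1)
for _ in range(200):
    beta -= 0.5 * penalised_cox_grad(beta, x, time, event)

print(f"estimated log-hazard ratio: {beta[0]:.2f}")
```

    The ridge penalty shrinks the estimate slightly below the true value of 1.0; an ensemble would average such fits across resamples or variable subsets.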

    Empirical evaluation of internal validation methods for prediction in large-scale clinical data with rare-event outcomes: a case study in suicide risk prediction

    Background There is increasing interest in clinical prediction models for rare outcomes such as suicide, psychiatric hospitalizations, and opioid overdose. Accurate model validation is needed to guide model selection and decisions about whether and how prediction models should be used. Split-sample estimation and validation of clinical prediction models, in which data are divided into training and testing sets, may reduce predictive accuracy and the precision of validation. Using all data for both estimation and validation increases the sample size for both procedures, but validation must then account for overfitting, or optimism. Our study compared split-sample and entire-sample methods for estimating and validating a suicide prediction model.

    Methods We compared the performance of random forest models estimated in a sample of 9,610,318 mental health visits (“entire-sample”) and in a 50% subset (“split-sample”), as evaluated in a prospective validation sample of 3,754,137 visits. We assessed the optimism of three internal validation approaches: for the split-sample prediction model, validation in the held-out testing set; for the entire-sample model, cross-validation and bootstrap optimism correction.

    Results The split-sample and entire-sample prediction models showed similar prospective performance; the area under the curve (AUC) and 95% confidence interval was 0.81 (0.77–0.85) for both. Performance estimates evaluated in the testing set for the split-sample model (AUC = 0.85 [0.82–0.87]) and via cross-validation for the entire-sample model (AUC = 0.83 [0.81–0.85]) accurately reflected prospective performance. Validation of the entire-sample model with bootstrap optimism correction overestimated prospective performance (AUC = 0.88 [0.86–0.89]). Measures of classification accuracy, including sensitivity and positive predictive value at the 99th, 95th, 90th, and 75th percentiles of the risk score distribution, supported the same conclusion: bootstrap optimism correction overestimated classification accuracy in the prospective validation set.

    Conclusions While previous literature demonstrated the validity of bootstrap optimism correction for parametric models in small samples, this approach did not accurately validate the performance of a rare-event prediction model estimated with random forests in a large clinical dataset. Cross-validation of prediction models estimated with all available data provides accurate independent validation while maximizing sample size.
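    The bootstrap optimism correction being evaluated above can be sketched in a few lines: estimate apparent performance on the full sample, then subtract the average gap between bootstrap-sample and original-sample performance. Everything below (logistic regression in place of the study's random forest, a rank-based AUC, tiny synthetic rare-event data) is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def rank_auc(y, score):
    """Mann-Whitney AUC computed from ranks."""
    order = np.argsort(score)
    ranks = np.empty(len(score))
    ranks[order] = np.arange(1, len(score) + 1)
    n_pos = y.sum()
    n_neg = len(y) - n_pos
    return (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def fit_logistic(X, y, lr=0.5, steps=300):
    """Gradient-descent logistic regression (stand-in for the study's model)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

# Hypothetical rare-event data: outcome prevalence around 2%.
n, d = 3000, 5
X = rng.normal(size=(n, d))
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X[:, 0] - 4.0)))).astype(float)

w = fit_logistic(X, y)
apparent = rank_auc(y, X @ w)          # AUC on the data the model was fit to

# Optimism bootstrap: refit on resamples, average the gap between
# bootstrap-sample AUC and original-sample AUC, subtract from apparent AUC.
optimism = []
for _ in range(20):
    idx = rng.integers(0, n, n)
    wb = fit_logistic(X[idx], y[idx])
    optimism.append(rank_auc(y[idx], X[idx] @ wb) - rank_auc(y, X @ wb))
corrected = apparent - np.mean(optimism)
print(f"apparent AUC={apparent:.3f}  optimism-corrected AUC={corrected:.3f}")
```

    The study's finding is that for highly flexible learners such as random forests this correction can remain too optimistic, while cross-validation tracks prospective performance more closely.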

    Predicting survival time for metastatic castration resistant prostate cancer: An iterative imputation approach [version 1; referees: 2 approved, 1 approved with reservations]

    In this paper, we present our winning method for survival time prediction in the 2015 Prostate Cancer DREAM Challenge, a recent crowdsourced competition focused on risk and survival time predictions for patients with metastatic castration-resistant prostate cancer (mCRPC). We are interested in using a patient's covariates to predict his or her time until death after initiating standard therapy. We propose an iterative algorithm that multiply imputes right-censored survival times and uses ensemble learning methods to characterize the dependence of these imputed survival times on possibly many covariates. We show that by iterating over the imputation and ensemble learning steps, we guide imputation with patient covariates and, in turn, optimize the accuracy of survival time prediction. This method is generally applicable to time-to-event prediction problems in the presence of right-censoring. We demonstrate the proposed method's performance with training and validation results from the DREAM Challenge and compare its accuracy with that of existing methods.
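    The iterate-between-imputation-and-learning idea can be sketched on toy data: fit a model to the current (observed plus imputed) times, re-impute each censored time as the model's prediction floored at the censoring time, and repeat. The single covariate, linear fit, and simple flooring rule below are simplifications assumed for illustration; the paper uses ensemble learners and proper multiple imputation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated right-censored log-survival times driven by one covariate
# with true slope 0.8 (hypothetical data).
n = 1000
x = rng.normal(size=n)
log_t = 1.0 + 0.8 * x + 0.5 * rng.normal(size=n)   # true log survival time
log_c = rng.normal(loc=1.5, size=n)                 # log censoring time
obs = np.minimum(log_t, log_c)
event = log_t <= log_c

# Iterate: fit on current times, then re-impute censored times as the
# model prediction, never below the observed censoring time.
imputed = obs.copy()
for _ in range(10):
    slope, intercept = np.polyfit(x, imputed, 1)    # stand-in for an ensemble learner
    pred = intercept + slope * x
    imputed[~event] = np.maximum(obs[~event], pred[~event])

print(f"fitted slope after iteration: {slope:.2f}")
```

    A fit to the raw censored times alone would be attenuated toward zero; iterating the imputation pulls the estimate back toward the true covariate effect.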