
    Prediction of delayed graft function after kidney transplantation : comparison between logistic regression and machine learning methods

    Background: Predictive models for delayed graft function (DGF) after kidney transplantation are usually developed using logistic regression. We evaluate the value of machine learning methods in the prediction of DGF. Methods: 497 kidney transplantations from deceased donors at the Ghent University Hospital between 2005 and 2011 are included. A feature elimination procedure is applied to determine the optimal number of features, resulting in 20 selected parameters (24 parameters after conversion to indicator parameters) out of 55 retrospectively collected parameters. Subsequently, 9 distinct types of predictive models are fitted using the reduced data set: logistic regression (LR), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), support vector machines (SVMs; using linear, radial basis function and polynomial kernels), decision tree (DT), random forest (RF), and stochastic gradient boosting (SGB). Performance of the models is assessed by computing sensitivity, positive predictive value and area under the receiver operating characteristic curve (AUROC) after 10-fold stratified cross-validation. AUROCs of the models are compared pairwise using the Wilcoxon signed-rank test. Results: The observed incidence of DGF is 12.5 %. DT is not able to discriminate between recipients with and without DGF (AUROC of 52.5 %) and is inferior to the other methods. SGB, RF and polynomial SVM are mainly able to identify recipients without DGF (AUROC of 77.2, 73.9 and 79.8 %, respectively) and only outperform DT. LDA, QDA, radial SVM and LR also have the ability to identify recipients with DGF, resulting in higher discriminative capacity (AUROC of 82.2, 79.6, 83.3 and 81.7 %, respectively), which outperforms DT and RF. Linear SVM has the highest discriminative capacity (AUROC of 84.3 %), outperforming each method, except for radial SVM, polynomial SVM and LDA. However, it is the only method superior to LR.
    Conclusions: The discriminative capacities of LDA, linear SVM, radial SVM and LR are the only ones above 80 %. None of the pairwise AUROC comparisons between these models is statistically significant, except linear SVM outperforming LR. Additionally, the sensitivity of linear SVM to identify recipients with DGF is amongst the three highest of all models. For both reasons, the authors believe that linear SVM is most appropriate to predict DGF.
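    The evaluation described in this abstract rests on two building blocks: stratified fold assignment and the AUROC. The sketch below illustrates both with the Python standard library only. It is not the authors' code; the function names `stratified_folds` and `auroc` and the toy data are assumptions made for illustration, and the pairwise Wilcoxon signed-rank comparison is not covered.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds so each fold keeps a similar class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin over the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 80 samples, 10 of them positive (12.5 %).
toy_labels = [1] * 10 + [0] * 70
toy_folds = stratified_folds(toy_labels, k=10)
positives_per_fold = [sum(toy_labels[i] for i in fold) for fold in toy_folds]
```

    With a rare outcome, stratification matters because an unstratified split can leave some folds with no positive cases at all, in which case the AUROC of that fold is undefined.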

    Sensor-Assisted Weighted Average Ensemble Model for Detecting Major Depressive Disorder

    The present methods of diagnosing depression are entirely dependent on self-report ratings or clinical interviews. These traditional methods are subjective, since the individual may or may not be answering the questions genuinely. In this paper, the data has been collected using self-report ratings and also using electronic smartwatches. This study aims to develop a weighted average ensemble machine learning model to predict major depressive disorder (MDD) with superior accuracy. The data has been pre-processed and the essential features have been selected using a correlation-based feature selection method. With the selected features, machine learning approaches such as Logistic Regression, Random Forest, and the proposed Weighted Average Ensemble Model are applied. Further, for assessing the performance of the proposed model, the area under the receiver operating characteristic curve has been used. The results demonstrate that the proposed Weighted Average Ensemble model performs with better accuracy than the Logistic Regression and the Random Forest approaches.
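    A weighted average ensemble combines the predicted probabilities of several base models into a single score per sample. The abstract does not state how the weights were chosen, so the sketch below takes them as given; the function name `weighted_average_ensemble`, the weights and the probabilities are all hypothetical.

```python
def weighted_average_ensemble(prob_lists, weights):
    """Return one weighted-average probability per sample.

    prob_lists: one list of predicted probabilities per base model.
    weights:    one non-negative weight per base model (normalised here).
    """
    if len(prob_lists) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(prob_lists[0])
    if any(len(probs) != n_samples for probs in prob_lists):
        raise ValueError("all models must score the same samples")
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_samples)
    ]


# Hypothetical probabilities from two base models for three samples.
lr_probs = [0.2, 0.6, 0.9]
rf_probs = [0.4, 0.8, 0.5]
ensemble_probs = weighted_average_ensemble([lr_probs, rf_probs], weights=[1, 3])
```

    One common choice, assumed here rather than taken from the paper, is to set each weight in proportion to the validation performance of the corresponding base model.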

    Utilizing Data Mining Techniques and Ensemble Learning to Predict Development of Surgical Site Infections in Gynecologic Cancer Patients

    Surgical site infections are costly to both patients and hospitals, increase patient mortality, and are the most common form of hospital-acquired infection. Gynecological cancer surgery patients are already at higher risk of developing an infection due to the suppression of their immune system. This research leverages popular data mining techniques to create a prediction model to identify high-risk patients. Implemented techniques include logistic regression, naive Bayes, recursive partitioning and regression trees, random forest, feed-forward neural network, k-nearest neighbor, and support vector machines with linear kernel. Weighted stacked generalization was implemented to improve upon the performance of the individual base-level models. The chosen meta-level classifiers were support vector machines with linear kernel, logistic regression, and k-nearest neighbor. The result is a model that identifies high-risk patients immediately following a surgical procedure with an AUC of 0.6864, accuracy of 0.6744, sensitivity of 0.7, and specificity of 0.6728.
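    Accuracy, sensitivity and specificity, as reported in this abstract, all derive from the four cells of a binary confusion matrix. The sketch below shows that arithmetic on toy labels; the function name `binary_metrics` and the data are hypothetical and do not reproduce the study's figures.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of true cases that are flagged
        "specificity": tn / (tn + fp),  # share of non-cases that are cleared
    }


# Toy example (hypothetical): 3 infected patients, 7 not infected.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
toy_metrics = binary_metrics(toy_true, toy_pred)
```

    When the outcome is rare, accuracy is dominated by specificity, which is one reason to read the three figures together rather than accuracy alone.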

    Predictive modeling of housing instability and homelessness in the Veterans Health Administration

    OBJECTIVE: To develop and test predictive models of housing instability and homelessness based on responses to a brief screening instrument administered throughout the Veterans Health Administration (VHA). DATA SOURCES/STUDY SETTING: Electronic medical record data from 5.8 million Veterans who responded to the VHA's Homelessness Screening Clinical Reminder (HSCR) between October 2012 and September 2015. STUDY DESIGN: We randomly selected 80% of Veterans in our sample to develop predictive models. We evaluated the performance of both logistic regression and random forests (a machine learning algorithm) using the remaining 20% of cases. DATA COLLECTION/EXTRACTION METHODS: Data were extracted from two sources: VHA's Corporate Data Warehouse and National Homeless Registry. PRINCIPAL FINDINGS: Performance for all models was acceptable or better. Random forests models were more sensitive in predicting housing instability and homelessness than logistic regression, but less specific in predicting housing instability. Rates of positive screens for both outcomes were highest among Veterans in the top strata of model-predicted risk. CONCLUSIONS: Predictive models based on medical record data can identify Veterans likely to report housing instability and homelessness, making the HSCR screening process more efficient and informing new engagement strategies. Our findings have implications for similar instruments in other health care systems. Funding: U.S. Department of Veterans Affairs (VA) Health Services Research and Development (HSR&D), Grant/Award Number: IIR 13-334. Accepted manuscript.
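    The study design in this abstract involves a random 80/20 split into development and test cases, followed by a comparison of observed outcome rates across strata of predicted risk. The sketch below illustrates both steps under stated assumptions: the function names `split_80_20` and `rate_by_risk_stratum`, the number of strata and the toy data are hypothetical, and the abstract does not say how the strata were defined.

```python
import random


def split_80_20(ids, seed=0, dev_fraction=0.8):
    """Randomly split ids into a development set and a held-out test set."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * dev_fraction)
    return shuffled[:cut], shuffled[cut:]


def rate_by_risk_stratum(risks, outcomes, n_strata=5):
    """Observed outcome rate per stratum, from lowest to highest predicted risk."""
    if len(risks) != len(outcomes):
        raise ValueError("risks and outcomes must have the same length")
    if len(risks) < n_strata:
        raise ValueError("need at least one case per stratum")
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    rates = []
    for s in range(n_strata):
        lo = s * len(order) // n_strata
        hi = (s + 1) * len(order) // n_strata
        members = order[lo:hi]
        rates.append(sum(outcomes[i] for i in members) / len(members))
    return rates


# Toy data (hypothetical): eight cases with predicted risks and 0/1 outcomes.
dev_ids, test_ids = split_80_20(range(100))
toy_risks = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
toy_outcomes = [0, 0, 0, 0, 0, 1, 1, 1]
toy_rates = rate_by_risk_stratum(toy_risks, toy_outcomes, n_strata=4)
```

    A model that ranks cases well should show rates that rise from the lowest to the highest stratum, as in the toy output here.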