7 research outputs found

    A discretization method based on maximizing the area under receiver operating characteristic curve

    Many machine learning algorithms require categorical features, so all numeric-valued data must first be discretized into intervals. In this paper, we present a new discretization method based on the area under the receiver operating characteristic (ROC) curve (AUC) measure. Maximum area under ROC curve-based discretization (MAD) is a global, static and supervised discretization method. MAD uses the sorted order of the continuous values of a feature and discretizes the feature so that the AUC based on that feature is maximized. The proposed method is compared with alternative discretization methods such as ChiMerge, Entropy-Minimum Description Length Principle (MDLP), Fixed Frequency Discretization (FFD), and Proportional Discretization (PD). FFD and PD were proposed recently and are designed for Naive Bayes learning. ChiMerge, like MAD, is a merging discretization method. Evaluations are performed in terms of M-Measure, an AUC-based metric for multi-class classification, and accuracy values obtained from Naive Bayes and Aggregating One-Dependence Estimators (AODE) algorithms on real-world datasets. Empirical results show that MAD is a strong candidate as an alternative to other discretization methods.
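To make the link between discretization and AUC concrete, the sketch below computes the AUC of a single feature before and after binning. It is not the MAD algorithm itself: the cut points, the toy data, and the helper names `auc` and `discretize` are all assumptions chosen for illustration, and each interval is simply scored by its index.

```python
from bisect import bisect_right

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def discretize(values, cut_points):
    """Map each value to the index of its interval (cut_points must be sorted)."""
    return [bisect_right(cut_points, v) for v in values]

# Hypothetical toy feature and binary class labels.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = [0, 0, 1, 0, 1, 1]

raw_auc = auc(values, labels)                             # AUC of the raw feature
binned_auc = auc(discretize(values, [2.5, 4.5]), labels)  # AUC after binning
print(raw_auc, binned_auc)
```

On this toy data the binned feature scores a higher AUC than the raw one, because merging the misordered pair (3.0, 4.0) into one interval turns a ranking error into a tie. A supervised discretizer that targets AUC searches for cut points with this effect.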

    Ranking instances by maximizing the area under ROC curve

    In recent years, the problem of learning a real-valued function that induces a ranking over an instance space has gained importance in the machine learning literature. Here, we propose a supervised algorithm that learns a ranking function, called ranking instances by maximizing the area under the ROC curve (RIMARC). Since the area under the ROC curve (AUC) is a widely accepted performance measure for evaluating the quality of a ranking, the algorithm aims to maximize the AUC value directly. For a single categorical feature, we show the necessary and sufficient condition that any ranking function must satisfy to achieve the maximum AUC. We also sketch a method to discretize a continuous feature so that it, too, reaches the maximum AUC. RIMARC uses a heuristic to extend this maximization to all features of a data set. The ranking function learned by the RIMARC algorithm is in a human-readable form; therefore, it provides valuable information to domain experts for decision making. Performance of RIMARC is evaluated on many real-life data sets by using different state-of-the-art algorithms. Evaluations of the AUC metric show that RIMARC achieves significantly better performance compared to other similar methods.
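The abstract does not state the condition for a single categorical feature. As an illustration only, the sketch below assumes the standard result that scoring each category by its fraction of positive instances yields the maximum AUC, and checks this by brute force on toy data. The function names and the data are hypothetical and are not taken from the RIMARC paper.

```python
from collections import defaultdict
from itertools import permutations

def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def category_scores(categories, labels):
    """Score each category by its observed fraction of positive instances."""
    counts = defaultdict(lambda: [0, 0])  # category -> [positives, total]
    for c, y in zip(categories, labels):
        counts[c][0] += y
        counts[c][1] += 1
    return {c: p / t for c, (p, t) in counts.items()}

# Hypothetical single categorical feature with binary labels.
categories = ["a", "a", "b", "b", "b", "c", "c"]
labels = [0, 0, 1, 0, 1, 1, 1]

scores = category_scores(categories, labels)
best_auc = auc([scores[c] for c in categories], labels)

# Brute-force check: no other ordering of the categories gives a higher AUC.
names = sorted(scores)
all_aucs = [
    auc([dict(zip(order, range(len(order))))[c] for c in categories], labels)
    for order in permutations(names)
]
print(scores, best_auc, max(all_aucs))
```

Because the learned score is a per-category table, it can be read directly by a domain expert, which is consistent with the human-readable form the abstract describes.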

    Validation of the EuroSCORE risk models in Turkish adult cardiac surgical population

    Objective: The aim of this study was to validate the additive and logistic European System for Cardiac Operative Risk Evaluation (EuroSCORE) models in the Turkish adult cardiac surgical population. Methods: The TurkoSCORE project involves a reliable web-based database to build up Turkish risk stratification models. The current patient population consisted of 9443 adult patients who underwent cardiac surgery between 2005 and 2010. However, the additive and logistic EuroSCORE models were applied to only the 8018 patients whose EuroSCORE determinants were complete. Observed and predicted mortalities were compared for low-, medium-, and high-risk groups. Results: The mean patient age was 59.5 years (±12.1 years) at the time of surgery, and 28.6% were female. There were significant differences (all p < 0.001) between the Turkish and European cardiac surgical populations, respectively, in the prevalence of recent myocardial infarction (23.5% vs 9.7%), moderate left ventricular function (29.9% vs 25.6%), unstable angina (9.8% vs 8.0%), chronic pulmonary disease (13.4% vs 3.9%), active endocarditis (3.2% vs 1.1%), critical preoperative state (9.0% vs 4.1%), surgery on thoracic aorta (3.7% vs 2.4%), extracardiac arteriopathy (8.6% vs 11.3%), previous cardiac surgery (4.1% vs 7.3%), and procedures other than isolated coronary artery bypass graft (CABG; 23.0% vs 36.4%). For the entire cohort, actual hospital mortality was 1.96% (n = 157; 95% confidence interval (CI), 1.70-2.32). However, additive predicted mortality was 2.98% (p < 0.001 vs observed; 95% CI, 2.90-3.00), and logistic predicted mortality was 3.17% (p < 0.001 vs observed; 95% CI, 3.03-3.21). The predictive performance of the EuroSCORE models for the entire cohort was fair, with an area under the receiver operating characteristic curve (AUC) of 0.757 (95% CI, 0.717-0.797) for the additive EuroSCORE and 0.760 (95% CI, 0.721-0.800) for the logistic EuroSCORE. Observed hospital mortality for isolated CABG was 1.23% (n = 75; 95% CI, 0.95-1.51), while additive and logistic predicted mortalities were 2.87% (95% CI, 2.82-2.93) and 2.89% (95% CI, 2.80-2.98), respectively. AUC values for the isolated CABG subset were 0.768 (95% CI, 0.707-0.830) and 0.766 (95% CI, 0.705-0.828) for the additive and logistic EuroSCORE models, respectively. Conclusion: The original EuroSCORE risk models overestimated mortality in all risk subgroups in the Turkish population. Remodeling strategies for EuroSCORE or the creation of a new model are warranted for future studies in Turkey. © 2011 European Association for Cardio-Thoracic Surgery
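The calibration comparison above can be sketched numerically from the whole-cohort figures in the abstract (157 deaths among 8018 patients, additive predicted mortality 2.98%). The abstract does not say how its confidence intervals were computed, so the Wilson score interval used here is an assumption and its bounds need not match the published ones. The observed/expected ratio is likewise an illustrative summary that the abstract does not report.

```python
import math

def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = events / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Figures taken from the abstract (entire cohort).
observed_deaths = 157
patients = 8018
additive_predicted_rate = 0.0298

observed_rate = observed_deaths / patients
low, high = wilson_interval(observed_deaths, patients)

# Observed/expected ratio: values below 1 mean the model overestimates mortality.
oe_ratio = observed_rate / additive_predicted_rate

print(f"observed {observed_rate:.2%}, interval {low:.2%} to {high:.2%}, O/E {oe_ratio:.2f}")
```

An O/E ratio below 1 for the additive model agrees with the abstract's conclusion that the original EuroSCORE overestimated mortality in this cohort.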

    Comparison of original EuroSCORE, EuroSCORE II and STS risk models in a Turkish cardiac surgical cohort

    OBJECTIVES: The aim of this study was to compare the additive and logistic European System for Cardiac Operative Risk Evaluation (EuroSCORE), EuroSCORE II and the Society of Thoracic Surgeons (STS) models in calculating mortality risk in a Turkish cardiac surgical population. METHODS: The current patient population consisted of 428 patients who underwent isolated coronary artery bypass grafting (CABG) between 2004 and 2012, extracted from the TurkoSCORE database. Observed and predicted mortalities were compared for the additive/logistic EuroSCORE, EuroSCORE II and STS risk calculator. The area under the receiver operating characteristic curve (AUC) values were calculated for these models to compare predictive power. RESULTS: The mean patient age was 74.5 ± 3.9 years at the time of surgery, and 35.0% were female. For the entire cohort, actual hospital mortality was 7.9% (n = 34; 95% confidence interval [CI] 5.4-10.5). However, the additive EuroSCORE-predicted mortality was 6.4% (P = 0.23 vs observed; 95% CI 6.2-6.6), logistic EuroSCORE-predicted mortality was 7.9% (P = 0.98 vs observed; 95% CI 7.3-8.6), EuroSCORE II-predicted mortality was 1.7% (P = 0.00 vs observed; 95% CI 1.6-1.8) and STS-predicted mortality was 5.8% (P = 0.10 vs observed; 95% CI 5.4-6.2). The mean predictive performance of the analysed models for the entire cohort was fair, with an AUC of 0.7 (95% CI 0.60-0.79). AUC values for the additive EuroSCORE, logistic EuroSCORE, EuroSCORE II and STS risk calculator were 0.70 (95% CI 0.60-0.79), 0.70 (95% CI 0.59-0.80), 0.72 (95% CI 0.62-0.81) and 0.62 (95% CI 0.51-0.73), respectively. CONCLUSIONS: EuroSCORE II significantly underestimated mortality risk for Turkish cardiac patients, whereas the additive and logistic EuroSCORE and STS risk calculators were well calibrated. © The Author 2013